Characterizing Farmers and Farming Systems in the Kilombero Valley Floodplain, Tanzania
# Clear the workspace
rm(list = ls())
# Load the packages used throughout the notebook
library(here)
library(haven)
library(dplyr)
library(ggplot2)
library(plotly)
library(ggthemes)  # provides scale_fill_economist()
# Read all the data sets into R
list.filenames <- list.files(path = here("Data"), pattern = "\\.dta$")
# Create an empty list to hold the data sets
list.data <- list()
# Loop through the file names and read each data set
for (i in seq_along(list.filenames)) {
  list.data[[i]] <- as_factor(haven::read_dta(here("Data", list.filenames[i])), only_labelled = TRUE)
  #list.data2[[i]] <- foreign::read.dta(list.filenames[i], convert.factors = TRUE)
}
# Attach the file names to the list
names(list.data) <- list.filenames
# Global rounding function
round_df <- function(x, digits) {
  # Round all numeric variables
  # x: data frame
  # digits: number of digits to round
  numeric_columns <- sapply(x, is.numeric)
  x[numeric_columns] <- round(x[numeric_columns], digits)
  x
}
myColors <- list( "#2f1a5c",
"#5f1d65",
"#8a2168",
"#b12d64",
"#d1435c",
"#e96050",
"#f98143",
"#ffa538")
myColors2 <- list("#26547C",
"#EF476F",
"#FFD166",
"#06D6A0",
"#454545",
"#9E2A2B",
"#E09F3E",
"#306B34",
"#5603AD",
"#52154E",
"#20A4F3",
"#0F8B8D",
"#F26419")
mycolor3 <- c("#19619d",
"#208a4b",
"#665191",
"#a05195",
"#b01255",
"#ffa600",
"#0e2759",
"#ff7c43"
)
1 Introduction
This notebook contains the code and results of the analysis conducted to characterize farmers and farming systems in the Kilombero Valley floodplain. The notebook is structured into four main sections. The first section details the data collection methods and instruments used for this study. The second section reports the main socio-economic characteristics of the surveyed farmers. The third section presents the results of the typology analysis based on this survey. The fourth section presents a second typology study, based on the 2007 Agriculture Sample Survey of the Tanzanian Government, to validate the clusters that emerged from our own data and to assess their stability.
2 Data Source
The main data source is a household survey conducted in the Kilombero floodplain in Tanzania between November and December 2015 as part of the project GLOBE, “Reconciling future food production and environmental sustainability in East African wetlands”. The surveys were carried out in 21 villages in two districts (Ulanga and Kilombero) of the Kilombero Valley. In total, 304 farm households were interviewed on a wide range of topics designed to describe the farming system in terms of resource availability and use and livelihood sources. [For this specific survey, a farm household is defined as individuals who live together, share meals and pool some or all of their income, and who cultivate land or keep livestock.]
library(rgdal)
library(leaflet)
shapeData <- readOGR("StudyAreaMap/StudyArea.shp", "StudyArea", verbose = FALSE)
shapeData <- spTransform(shapeData, CRS("+proj=longlat +datum=WGS84 +no_defs"))
cities<- read.csv2(here("Data_csv", "wardCenterPoint.csv"))
leaflet(shapeData) %>%
addProviderTiles(providers$Esri.WorldImagery)%>%
addProviderTiles(providers$OpenStreetMap, options = providerTileOptions(opacity = 0.2))%>%
setView( lat = -8.350750, lng = 36.700436,zoom=9)%>%
addPolygons(weight = 1, col ="white", fillColor = topo.colors(10, NULL),
highlight = highlightOptions(color = "black", weight = 3, bringToFront = TRUE), label = paste0(shapeData$WARD_NAME," , ", "Farm Population: ", shapeData$NoFarmHH12) )%>%
addCircles(data = cities, lng = ~lng, lat = ~lat, weight = 1,
radius = ~sqrt(cities$NoHH2012) * 50)%>%
addCircles(data = cities, lng = ~lng, lat = ~lat, weight = 1,
radius = ~sqrt(cities$NoFHH12_2) * 50, color = "red")
The selection of households to be interviewed was based on a multi-stage sampling strategy. In the first stage, 12 wards were selected purposively based on the presence of floodplain farming. In the second stage, 21 villages were selected randomly within the wards. In the final stage, households were selected randomly from the list provided by each village leader. The number of interviewees per village ranges from 5 in the smaller villages to 15 in the largest. A GIS coverage incorporating the land use map from GLC30 and the administrative boundaries and census data from the Tanzania statistics office was used to estimate the boundaries and total population size of the study area. The primary data were collected using a standard questionnaire that solicited information on aspects of rural livelihoods: demographic details of the household; land use, land ownership and acreage; labor use; physical quantities of crop outputs and their use as food or income source; ownership of various assets; responses to shocks; the embeddedness of households in social networks and institutions; and future prospects and plans. The questionnaire was administered in one-to-one interviews by five well-trained enumerators (graduates of Sokoine University of Agriculture) who understand the farming context and the local language. The questionnaire was originally written in English and translated into Swahili during the interview by the field assistants. A pre-test survey was also conducted to assess how well the field assistants could administer the questionnaire and how farmers understood the questions. Some questions and answer options were then modified to match the villagers' understanding.
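The three sampling stages described above can be sketched in a few lines of R. This is only an illustration under stated assumptions: the sampling frame (`frame`), the village list lengths and the linear rule that maps village size to 5 to 15 interviews are all invented here, and villages are drawn across the pooled wards rather than ward by ward. It is not the procedure actually used in the field.

```r
# Hypothetical sampling frame: 12 purposively chosen wards, 4 villages each,
# with a household list of 40 to 120 entries per village (all values invented)
set.seed(2015)
frame <- expand.grid(village_no = 1:4, ward = paste0("ward_", 1:12))
frame$village <- paste(frame$ward, frame$village_no, sep = "_v")
frame$n_listed <- sample(40:120, nrow(frame), replace = TRUE)

# Stage 2: draw 21 villages at random from the selected wards
sampled_villages <- frame[sample(nrow(frame), 21), ]

# Stage 3: households per village, from 5 in the smallest sampled village
# to 15 in the largest (linear rule, assumed for illustration)
size_rank <- (sampled_villages$n_listed - min(sampled_villages$n_listed)) /
  (max(sampled_villages$n_listed) - min(sampled_villages$n_listed))
sampled_villages$n_sample <- round(5 + 10 * size_rank)

# Draw the households at random from each village list
households <- do.call(rbind, lapply(seq_len(nrow(sampled_villages)), function(i) {
  v <- sampled_villages[i, ]
  data.frame(village = v$village,
             household = sort(sample(v$n_listed, v$n_sample)))
}))

# Number of sampled households per village
table(households$village)
```

Because the first stage is purposive, results from such a design describe floodplain-farming wards rather than the two districts as a whole.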
3 Socio-economic characteristics of Households
3.1 Household Demographics
Demographic variables examined include the age, sex, and marital status of the household head, family size and composition, and the level of education. The surveyed households were predominantly headed by men, with only 16 percent headed by women. Household heads were, on average, 46 years old, and 77 percent of them were married. Most of the women who head households (92 percent) are widowed, divorced or separated. The education level of household heads was low: 7 percent lacked any formal education and a further 83 percent had completed only primary school. The average household size for the entire sample was 5 (SD = 2.18, n = 304), with a minimum of 2 and a maximum of 11 members. Forty-four percent of respondents have a family of fewer than 4 members, which can be considered small. Another 41 percent are medium-sized, with 5 to 8 members, and 12 percent of households in the sample are extended families with more than 8 members.
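As a minimal sketch of how such size classes and summary statistics can be derived, the snippet below groups household sizes with `cut()`. The vector `hh_size` is invented for illustration, and the class boundaries (small: up to 4 members, medium: 5 to 8, extended: more than 8) are an assumption, since the text does not say where a four-member household falls.

```r
# Invented household sizes, for illustration only (not survey data)
hh_size <- c(2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 8, 9, 10, 11)

# Assumed class boundaries: small = up to 4, medium = 5 to 8, extended = 9 or more
size_class <- cut(hh_size,
                  breaks = c(0, 4, 8, Inf),
                  labels = c("small", "medium", "extended"))

# Percentage of households in each class
class_share <- round(100 * prop.table(table(size_class)), 1)
class_share

# Mean and standard deviation of household size
c(mean = mean(hh_size), sd = sd(hh_size))
```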
sum <-
summarytools::freq(list.data$combined4.1.dta$EducationLevel, order = "freq")
sum <- sum[-5, ]
DT::datatable(
round(sum, 3),
options = list(bPaginate = FALSE),
caption = htmltools::tags$caption(
style = 'caption-side: top; text-align: center;',
'Table 1: ',
htmltools::em('Education of Household Head.')
)
)
sum2 <-
summarytools::freq(list.data$combined4.1.dta$Age_Class, order = "freq")
sum2 <- sum2[-5, ]
DT::datatable(
round(sum2, 3),
options = list(bPaginate = FALSE),
caption = htmltools::tags$caption(style = 'caption-side: top; text-align: center;',
'Table 2: ', htmltools::em('Age class of Household Head.'))
)
df <- list.data$combined4.1.dta$gender
df <- as.data.frame(df)
df2 <- df%>%
group_by(df)%>%
dplyr::summarise(counts=n())%>%
mutate(Percent = round(counts*100/sum(counts), 1))
plot_ly(df2, labels= ~df, values= ~counts, marker = list(colors = mycolor3,
line = list(color = '#FFFFFF', width = 1)))%>%
add_pie() %>%
layout(title = "Gender of Household Head", showlegend =T,
xaxis = list(showgrid = FALSE, zeroline = FALSE, showticklabels = FALSE),
yaxis = list(showgrid = FALSE, zeroline = FALSE, showticklabels = FALSE))
x <- list.data$combinedNew.dta$HHsize
fit<- density(x)
plot_ly(x = x, type = "histogram",histnorm = "probability", name = "HH_size %") %>%
add_trace(x = fit$x, y = fit$y, type = "scatter", mode = "lines", fill = "tozeroy", yaxis = "y2", name = "HH_size_Density") %>%
layout(yaxis2 = list(overlaying = "y", side = "right"),title = "Distribution of Household size", showlegend =T,
xaxis = list(title="Household Size",showgrid = FALSE, zeroline = FALSE),
yaxis = list(showgrid = FALSE, zeroline = FALSE))
3.2 Livelihood and diversification
3.2.1 On farm livelihood source
Most households in the surveyed villages obtain their livelihood from agriculture. Rice and maize are the most important crops, both for home consumption and for income generation. A small number of households, specifically recently migrated pastoralists, also integrate crop production with livestock rearing.
Because the smallholder producers in the study area are semi-subsistence farm households, part of the total product is retained for home consumption and the remainder is sold on the market. For example, almost 80 percent of rice-producing farmers reported selling, on average, 58 percent of their rice harvest to cover the costs of inputs and basic household needs. However, in most of the villages that are far from the nearest big market, where the milling services are located, farmers usually sell their rice harvest and later buy back milled rice from small traders at almost double their selling price. On average, households engaged in crop production in the wetlands earn a gross income of 640 thousand TZS per hectare per year, of which 40 percent is accounted for by home consumption.
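The split of gross crop income between home consumption and market sales can be worked out directly from the two figures quoted above. The snippet is plain arithmetic on those reported averages; the variable names are illustrative only.

```r
gross_income_per_ha <- 640000  # TZS per hectare per year (average from the text)
home_share <- 0.40             # share accounted for by home consumption (from the text)

# Value of produce consumed at home and value sold on the market
home_value <- gross_income_per_ha * home_share
cash_value <- gross_income_per_ha * (1 - home_share)

c(home_consumption = home_value, marketed = cash_value)
```

Under these averages, roughly 256 thousand TZS per hectare is consumed at home and about 384 thousand TZS per hectare is sold.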
3.2.2 Off farm livelihood sources
Although farming is the dominant livelihood strategy for the majority of farmers, 26 percent of the households reported receiving some form of off-farm income during the year. The most common sources of off-farm income in the area are remittances, land rental, brick selling and small business shops. Fourteen percent of the surveyed households received income from business, primarily from small retail stores and transportation services [Bajaji and bodaboda]. Seven percent of the households engaged in brick production, receiving on average a gross revenue of 64,763 TZS per year (these are often farmers residing close to Ifakara). Contrary to expectations, the majority of the farm households surveyed do not sell fish to generate cash. However, this does not mean that there is no fishing activity in the valley. The valley supports a number of households by providing income from fishing, an activity usually carried out by a marginalized group of fishers whose livelihood is based solely on fishing.
# Income by source
income <- list.data$Income.dta
income1 <- income[,c(4:9)]
income_1 <- reshape2::melt(income1)
df <- income_1%>%
group_by(variable)%>%
dplyr::summarise(Avarage=mean(value, na.rm = T))
plot_ly(df, labels = ~variable, values = ~Avarage,
marker = list(colors=mycolor3,
line = list(color = '#FFFFFF', width = 1)))%>%
add_pie(hole = 0.6) %>%
layout(title = "Annual Household Income by Source", showlegend =T,
xaxis = list(showgrid = FALSE, zeroline = FALSE, showticklabels = FALSE),
yaxis = list(showgrid = FALSE, zeroline = FALSE, showticklabels = FALSE))
income <- list.data$Income.dta$TotalHouseholdIncome
income <- remove_missing(as.data.frame(income))
# Distribution of income (the single column of this data frame is named "income")
income_2 <- income
# Express income in thousands and add a small constant so that zeros can be logged
x <- (income_2$income / 1000) + 0.001
x <- x[which(x < 20000)]
# Log-transform income and estimate a kernel density
x <- log(x)
fit <- density(x)
plot_ly(x = x, type = "histogram",histnorm = "probability", name = "Log Income") %>%
add_trace(x = fit$x, y = fit$y, type = "scatter", mode = "lines", fill = "tozeroy", yaxis = "y2", name = "Density") %>%
layout(yaxis2 = list(overlaying = "y", side = "right", tickformat= ',.0%'),title = "Distribution of Household Income", showlegend =T,
xaxis = list(title="Log Income",showgrid = FALSE, zeroline = FALSE),
yaxis = list(showgrid = FALSE, zeroline = FALSE, tickformat= ',.0%'))
gini <- function(x) {
n=length(x)
mu=mean(x)
g=2/(n*(n-1)*mu)*sum((1:n)*sort(x))-(n+1)/(n-1)
return(g)
}
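boot <- function(x, f, b = 500) {
  # Nonparametric bootstrap: recompute the statistic f on b resamples of x
  n <- length(x)
  stat <- rep(NA_real_, b)  # one slot per bootstrap replicate
  for (s in 1:b) {
    idx <- sample(1:n, size = n, replace = TRUE)
    stat[s] <- f(x[idx])
  }
  return(stat)
}
# income is a one-column data frame, so pass the column itself
G <- boot(income$income, gini, 1000)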
plot_ly(x=~G, type="histogram",
histnorm = "probability", name="GI")%>%
add_segments(x=0.523, xend=0.669, y= 0.01, yend = 0.01, name="90% CI", line = list(width = 4))%>%
layout(title="Gini Index based on 1000 bootstrap resamples", xaxis = list(title = "Gini Index", boundmode='soft'), yaxis=list(tickformat= ',.0%'))
# 90% percentile interval of the bootstrapped Gini index
z <- quantile(G, c(.05, .95))
x <- ineq::Lc(income$income)
p <- x$p
L <- x$L
lp <-x$L.general
data <- data.frame(p, L, lp)
data <- round_df(data, 4)
ax <- list(
showline = TRUE,
mirror = "ticks",
gridcolor = toRGB("gray25"),
gridwidth = 0,
linecolor = toRGB("gray25"),
linewidth = 0,
title="Cumulative Proportion of Households"
)
ay <- list(
showline = TRUE,
mirror = "ticks",
gridcolor = toRGB("gray25"),
gridwidth = 0,
linecolor = toRGB("gray25"),
linewidth = 0,
title="Cumulative Proportion of Income"
)
plot_ly(data, x = ~p, y = ~L, name="Lorenz curve", type = 'scatter', mode = 'lines')%>%
add_trace(y=~p, name="Line of Equality")%>%
layout(title="Lorenz Curve of Farmers in KVFP",xaxis=ax, yaxis=ay)
3.2.3 Food Security
#foodsecurity <- as_factor(haven::read_dta(here("Data", "FoodSecurityFinal.dta"), "labels"))
foodsecurity1 <- list.data$FoodSecurityFinal.dta
farmType <- read.csv(here("Data_csv", "DataWithCluster.csv"))
farmType <- farmType[,c(16,17)]
foodsecurity <- merge(foodsecurity1, farmType, by="hhid")
foodsecurity$clust <- factor(foodsecurity$clust)
x.dens <- density(foodsecurity$PercapitaDailyEnergyAcquistion)
df.dens <- data.frame(x = x.dens$x, y = x.dens$y)
g <- ggplot(foodsecurity, aes(PercapitaDailyEnergyAcquistion))+ stat_density(alpha = 0.7 )+ labs(x="Kcal/person/day", y="PDF of the dietary energy consumption", title="Distribution of dietary energy consumption")+
geom_vline(aes(xintercept=2731),color="blue", linetype="dashed", size=0.5)+
geom_area(data = subset(df.dens, x <= 2731), aes(x=x,y=y), fill="#edb942")+
theme_light()+ theme(legend.position="none")
a <- list(
x = 2794,
y = 0.00025,
text = "Minimum Dietary energy requirement<br> 2730 Kcal/person/day",
xref = "x",
yref = "y",
showarrow = TRUE,
arrowhead = 25,
ax = 250,
ay = -20
)
ggplotly(g)%>%
layout(annotations = a, titlefont = list(family = "sans-serif"))
3.2.4 Access to credit
Thirty-three percent of the surveyed households obtained credit from institutional and non-institutional sources. The National Agriculture Census (2009) reported even lower figures, corroborating that most farmers in Kilombero and Ulanga districts have limited access to credit, with only 2.4 percent and 2.5 percent of households, respectively, having access to credit. The sources of credit for households include people credit funds (Savings and Credit Cooperatives (SACCOs)), commercial banks, and village lenders. People credit funds were the main source of formal credit (52 percent) because of the convenience and flexibility of their payment terms, despite the high interest rates they charge. The largest proportion (63 percent) of the credit is channeled to purchasing inputs. During the field visit it was observed that farmers also engage in contract farming with small traders. In this arrangement, a trader provides the cash required for expenditures during the planting season, and the farmer agrees to sell the expected rice harvest to the trader at a price agreed in advance.
Source of Credit
df2 <- as.data.frame(list.data$HouseholdHeadCxs.dta$creditorg)
df2 <- as.data.frame(remove_missing(df2))
names(df2)[1] <- "creditorg"
df <- df2%>%
group_by(creditorg)%>%
dplyr::summarise(counts=n())%>%
arrange(desc(creditorg)) %>%
mutate(prop = round(counts*100/sum(counts), 1),
lab.ypos = cumsum(prop) - 0.5*prop)
levels(df$creditorg)[11] <- "Group"
plot_ly(df, labels= ~creditorg, values= ~counts, opacity=0.9,
marker = list(colors=mycolor3,
line = list(color = '#FFFFFF', width = 1)),textposition = 'auto',
textinfo = 'label+percent',
insidetextfont = list(color = '#FFFFFF'))%>%
add_pie() %>%
layout(title = "Source of Credit", showlegend =F,
xaxis = list(showgrid = FALSE, zeroline = FALSE, showticklabels = FALSE),
yaxis = list(showgrid = FALSE, zeroline = FALSE, showticklabels = FALSE))
Purpose of Credit
df2 <- as.data.frame(list.data$HouseholdHeadCxs.dta$Purp1)
df2 <- as.data.frame(remove_missing(df2))
names(df2)[1] <- "purpose"
df <- df2%>%
group_by(purpose)%>%
dplyr::summarise(counts=n())%>%
arrange(desc(purpose)) %>%
mutate(prop = round(counts*100/sum(counts), 1),
lab.ypos = cumsum(prop) - 0.5*prop)
plot_ly(df, labels= ~purpose, values= ~counts,
marker = list(colors=mycolor3,
line = list(color = '#FFFFFF', width = 1)))%>%
add_pie(hole = 0.6) %>%
layout(title = "Main Reason for Borrowing", showlegend =T,
xaxis = list(showgrid = FALSE, zeroline = FALSE, showticklabels = FALSE),
yaxis = list(showgrid = FALSE, zeroline = FALSE, showticklabels = FALSE))
3.3 Resource Endowment
3.3.1 Land
In most sub-Saharan African countries, land is the most important asset and reflects the economic situation, power, prestige and security of a farmer or farm household. The amount of land to which a household has access, and the terms on which it uses that land, influence, if not determine, its decisions about how to use land resources to earn a livelihood. The average farm size in the study area was 2.6 hectares (SD = 2.8), with a maximum of 21 hectares. As the figure below shows, almost 55 percent of the households own less than 2 hectares of land. Farmers typically own multiple plots, with 62 percent of them owning two or more. Usually the largest plot, located in the seasonally flooded area, is used for rice and/or maize production, while the smaller plots are often where the homesteads are located and where households plant some vegetables for home consumption. The figures also show that farmers engage in multi-season farming or in the land rental market, which makes their actual planted area larger than the land they own. The figure below also indicates the relationship between farm size and household size. The fractional polynomial regression graph shows that larger families are likely to have larger farms. There are two likely reasons: larger families need more land to produce enough food for the household, and with more members it is easier to manage a larger farm.
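A statement such as "almost 55 percent of the households own less than 2 hectares" is a reading of the empirical cumulative distribution of farm size. The sketch below shows that reading on an invented vector `farm_size_ha` (not survey data), using base R's `ecdf()`; note that `ecdf()` counts values up to and including the cut-off, so the strict "less than" share is computed separately.

```r
# Invented farm sizes in hectares, for illustration only
farm_size_ha <- c(0.4, 0.8, 1.0, 1.2, 1.6, 1.8, 2.0, 2.5, 3.0, 4.5, 6.0, 21.0)

# ecdf() returns a step function F with F(x) = share of observations <= x
F_farm <- ecdf(farm_size_ha)

# Share of households with strictly less than 2 ha
share_below_2ha <- mean(farm_size_ha < 2)

# Share of households with 2 ha or less, read from the ECDF
share_up_to_2ha <- F_farm(2)

c(below_2ha = share_below_2ha, up_to_2ha = share_up_to_2ha)
```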
df<-list.data$`number of fields.dta`
df$parcel <-as.factor(df$parcel)
df2 <- df%>%
group_by(parcel)%>%
dplyr::summarise(counts=n())%>%
arrange(desc(parcel)) %>%
mutate(prop = round(counts*100/sum(counts), 1),
lab.ypos = cumsum(prop) - 0.5*prop)
plot_ly(df2, x = ~parcel, y = ~prop, type = 'bar',text = ~round(prop,2), textposition = 'outside',textfont = list(color = 'black', size=12), marker = list(color = myColors2)) %>%
layout(title = "Number of Parcels Owned",
xaxis = list(title = "Number of Parcels"),
yaxis = list(title = "Percent of Households"))
df <- list.data$LandUseAndLandshare.dta
df <- df[,c(7,14)]
df2 <- reshape2::melt(df)
gg<-ggplot(df2,aes(x = value)) + stat_ecdf(aes(colour = variable))+guides(fill=guide_legend(title = NULL))+
labs(x="Farm Size in Ha", y="ecd", title="Empirical cumulative distribution of farm size and Planted area")+theme_light()+scale_fill_economist()+
theme(legend.title = element_blank())
gg
x <- list.data$combinedNew.dta$HHsize
y <- list.data$LandUseAndLandshare.dta$FarmSize_Ha
df <- data.frame(x, y)
p_size <- ggplot(df, aes(x, y))+ stat_smooth()+labs(x="HH Size", y="Farm size_ha", title =" Relation between Farm Size and Household Size") +theme_light()
ggplotly(p_size)
x <- list.data$combinedNew.dta$HHsize
y <- list.data$LandUseAndLandshare.dta$FarmSize_Ha
z <- list.data$combined4.1.dta$district
df <- data.frame(x, y, z)
df2 <- reshape2::melt(df)
levels(df2$variable)<-c("HHsize", "Farm_Size")
plot_ly(df2, x=df2$z, y=df2$value, color =~ df2$variable, type="box", colors = "Dark2")%>%
layout(boxmode = "group",title = "Box plot of HHsize and Farm size",
xaxis = list(title = "District"),
yaxis = list(title = "Percent of Households"))
As in most sub-Saharan African countries, the livelihood of most farmers in the Kilombero valley is intimately tied to the land, and security of land ownership is critical to the economic, social and environmental functioning of the area. In Tanzania, land is in general the property of the state. However, customary rules usually govern access to and ownership of land. Eighty percent of the farmers surveyed reported that they own their land without any deeds. The remainder reported that the land is owned by the family, rented, or owned with deeds. Communal farmland is not common in the area. The demand for farmland in the wetland has been increasing in recent years. Increasing rainfall volatility, the frequent occurrence of extreme weather events and the immense livelihood potential of wetlands have attracted farmers from different parts of the country to acquire land in the valley. The survey results show that the majority of farmers acquired their wetland farm in the past 10 to 15 years. The following plot of the year in which farmers started using the wetland shows that some households have been farming the wetland for over 40 years, but that most farmers acquired their farm in the past two to three decades. There was a surge in the number of farmers who started using the wetland for cropping in the late 1980s and the early 2000s.
wetstart <- as.data.frame(list.data$TextQuestion.dta$wetstart)
names(wetstart)[1]<- "WetStart"
wetstart2 <- wetstart%>%
group_by(WetStart)%>%
dplyr::summarise(NumberOfEntrant = n())%>%
arrange(WetStart)%>%
mutate(cum_frequency = cumsum(NumberOfEntrant))
scaleFactor <- max(wetstart2$NumberOfEntrant/max(wetstart2$cum_frequency))
p<-ggplot(wetstart2, aes(x = WetStart))+
geom_smooth(aes(y = NumberOfEntrant), method = "loess")+
geom_smooth(aes(y = cum_frequency * scaleFactor),method = "loess", col="red")+
scale_y_continuous(name="Number of Entrants", sec.axis=sec_axis(~./scaleFactor, name="Cumulative Frequency"))
ggplotly(p)%>%
layout(title="Entry to Wetland Farming")
Farmers access and acquire land in the study area in eight different ways. Overall, one third of all respondents acquired their land through inheritance from their parents, while 22 percent and 14 percent of farm households acquired their land through purchase and through occupation (that is, by clearing bush land or forest), respectively. Some farmers also acquired plots ceded to them by either the district government office or the village authority. Given the economic, institutional and biophysical constraints farmers face in the study area, most households did not expand their farm in the past five years. Only eight percent of the surveyed households have expanded their land, and fifteen percent plan to do so.
Tenure
df<-list.data$TotalPlantedAreaHa.dta$tenure
df<- as.data.frame(df)
df$df <-as.factor(df$df)
df <- remove_missing(df)
df<- as.data.frame(df)
df2 <- df%>%
group_by(df)%>%
dplyr::summarise(counts=n())%>%
arrange(desc(df)) %>%
mutate(prop = round(counts*100/sum(counts), 1),
lab.ypos = cumsum(prop) - 0.5*prop)
df2 <- df2[which(df2$prop > 1),]
plot_ly(df2, x = ~df, y = ~prop, type = 'bar', text = ~round(prop,2), textposition = 'outside',textfont = list(color = 'black', size=12), marker = list(color = myColors2)) %>%
layout(title = "Land Tenure",
xaxis = list(title = "Land tenure"),
yaxis = list(title = "Percent of Households"))
Land Acquisition
df <- list.data$ParcelHistory.dta$Landacqr
df <- as.data.frame(df)
df <- remove_missing(df)
df2 <- df%>%
group_by(df)%>%
dplyr::summarise(counts=n())%>%
arrange(desc(df)) %>%
mutate(prop = round(counts*100/sum(counts), 1),
lab.ypos = cumsum(prop) - 0.5*prop)
plot_ly(df2, y = ~df, x = ~prop, type = 'bar', text = ~round(prop,2), textposition = 'outside',textfont = list(color = 'black', size=12),orientation = 'h', color = ~df, colors = "Paired") %>%
layout(title = "Land Access mode",
xaxis = list(title = "Percent of Households"),
yaxis = list(title = ""), margin=list(l = 200, r = 10, t = 70, b = 80), showlegend=FALSE)
3.3.2 Labour
Having sufficient labor is one of the determining factors for the livelihood of households in the valley. Labor is provided either by household members or hired. The survey results show that hiring and exchanging labor is common in the area. Ninety-four percent of the surveyed households hired laborers during the 2015 cropping season to help with different stages of cultivation, most of them during land preparation and cultivation. Hired laborers come either from the local labor pool or are migrants who come from different parts of Tanzania during the farming season. Although almost all farmers hire labor for cropping, the larger part of the labor input (measured in man-days per year) is provided by the family. On average, 63 percent of total man-days is provided by family labor and the remaining 37 percent by hired labor. However, there is large variation between farmers, with some depending entirely on hired labor and others entirely on family labor.
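The family-labor share can be computed in two ways that generally give different answers: as the average of each household's own share, or as the pooled share of all man-days. The sketch below uses an invented data frame `labour` (hypothetical column names and values, not survey data) to show both; which of the two underlies the 63 percent reported above is not stated in the text.

```r
# Invented man-days per household, for illustration only
labour <- data.frame(
  hhid = 1:5,
  family_days = c(120, 0, 80, 60, 200),
  hired_days = c(40, 90, 80, 0, 50)
)

# Household-level share of man-days supplied by family members
labour$family_share <- labour$family_days /
  (labour$family_days + labour$hired_days)

# Average of the household shares (every household weighs the same)
mean_of_shares <- mean(labour$family_share)

# Pooled share (households with more man-days weigh more)
pooled_share <- sum(labour$family_days) /
  sum(labour$family_days + labour$hired_days)

c(mean_of_shares = mean_of_shares, pooled_share = pooled_share)
```

The example also includes one household relying entirely on hired labor and one relying entirely on family labor, the two extremes mentioned in the text.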
3.3.3 Capital
Most farmers perform the bulk of their farming activities using simple hand tools (such as hoes, axes and digging forks). However, the use of tractors and draft animals during the initial land preparation stage is common in the valley. As most farmers cannot afford to own a tractor individually, they rely on the services of tractor owners who come to the valley from all over the country during the land preparation period. Fifty-seven percent of the surveyed households hired a tractor for land preparation during the last cropping season. Hiring oxen is also common in the valley: twenty-eight percent of the surveyed households hired oxen for land preparation. Farmers prefer hiring oxen for two main reasons: first, the demand for tractors exceeds the supply during the peak season, and second, most farms are located in the alluvial plain, where there is no road for a tractor to reach the farm. Migrant pastoralists and agro-pastoralists usually provide the oxen, at a price almost equal to that of a tractor. Nine percent of the households did not hire any farm machinery or oxen and used traditional, manually operated tools.
All_Crops Land Preparation
df2 <- list.data$landpreparation3.dta$LandPrep
df3 <- list.data$landpreparation4.dta
df2<- as.data.frame(df2)
df <- df2%>%
group_by(df2)%>%
dplyr::summarise(counts=n())%>%
arrange(desc(df2)) %>%
mutate(prop = round(counts*100/sum(counts), 1),
lab.ypos = cumsum(prop) - 0.5*prop)
###############bar chart of land preparation method all crops
plot_ly(df, y = ~df2, x = ~prop, type = 'bar', text = ~round(prop,2), textposition = 'outside',textfont = list(color = 'black', size=12),name="All_Crops",orientation = 'h', marker = list(color = mycolor3)) %>%
layout(title = "Land Preparation Method",
xaxis = list(title = "Percent of Households"),
yaxis = list(title = ""), margin=list(l = 200, r = 10, t = 70, b = 80))
Rice Land Preparation
df_1 <- as.data.frame(df3$landprep_Rice)
df_1<- remove_missing(df_1)
df_1 <- as.data.frame(df_1)
names(df_1)[1] <- "LandPrep"
df <- df_1%>%
group_by(LandPrep)%>%
dplyr::summarise(counts=n())%>%
arrange(desc(LandPrep)) %>%
mutate(prop = round(counts*100/sum(counts), 1),
lab.ypos = cumsum(prop) - 0.5*prop)
##################################################
df_2 <- as.data.frame(df3$landprep_Maize)
df_2<- remove_missing(df_2)
df_2 <- as.data.frame(df_2)
names(df_2)[1] <- "LandPrep"
df_3 <- df_2%>%
group_by(LandPrep)%>%
dplyr::summarise(counts=n())%>%
arrange(desc(LandPrep)) %>%
mutate(prop = round(counts*100/sum(counts), 1),
lab.ypos = cumsum(prop) - 0.5*prop)
##################################################
###### bar plot of land preparation for Rice
plot_ly(df, x = ~LandPrep, y = ~prop, type = 'bar', text = ~round(prop,2), textposition = 'outside',textfont = list(color = 'black', size=12), color =~LandPrep, colors = mycolor3) %>%
layout(title = "Land Preparation Method for Rice cultivation",
xaxis = list(title = "Land Preparation Method"),
yaxis = list(title = "Percent of Households"))
Maize Land Preparation
plot_ly(df_3, x = ~LandPrep, y = ~prop, type = 'bar', text = ~round(prop,2), textposition = 'outside',textfont = list(color = 'black', size=12),color=~LandPrep, colors = "Set1") %>%
layout(title = "Land Preparation Method for Maize",
xaxis = list(title = "Land Preparation Method"),
yaxis = list(title = "Percent of Households"))
3.3.4 Livestock
Although the two districts have experienced a surge in the number of cattle due to an increase in pastoralists and agro-pastoralists, most sedentary farmers are not engaged in extensive livestock farming. While sixty-three percent of the surveyed households reported owning at least one livestock animal, the large majority of these households keep small ruminants. Chickens are the most commonly kept animals, with over 94 percent of livestock-keeping households raising at least one chicken or duck, followed by indigenous goats and sheep (24 percent), indigenous cattle (18 percent) and pigs (1 percent).
# box plot of livestock ownership
livestock <- list.data$livestockfinal.dta
livestock2 <- livestock[,c(2:7)]
livestock2<- reshape2::melt(livestock2)
livestock3 <- livestock2[which(livestock2$value >0),]
# p1: only households that own the species (counts above zero)
p1 <-plot_ly(livestock3, y = ~value, color = ~variable, type = "box", legendgroup=~variable)%>%
layout(title = "Livestock Ownership",
xaxis = list(title = "Livestock ownership (owners only)"),
yaxis = list(title = "Number of livestock"), margin=list(l = 30, r = 5, t = 50, b = 80))
# p2: all households, including those with zero animals
p2 <-plot_ly(livestock2, y = ~value, color = ~variable, type = "box", showlegend=F)%>%
layout(title = "Livestock Ownership",
xaxis = list(title = "Livestock ownership (all households)"),
yaxis = list(title = ""), margin=list(l = 30, r = 5, t = 50, b = 80))
plotly::subplot(p2, p1, margin = .03, nrows = 1, titleY = TRUE, titleX = TRUE)
As the table below shows, livestock-keeping households in Kilombero district and Ulanga district own an average of 0.88 and 4.8 Tropical Livestock Units (TLU), respectively. In Ulanga district, livestock farmers keep relatively large herds, with some households owning more than 100 cattle, whereas poultry and small ruminants are the most common species in Kilombero. Overall, livestock play a minor economic role in the valley compared with crop production.
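A Tropical Livestock Unit aggregates different species into a single herd-size measure by weighting each animal with a conversion factor. The sketch below illustrates the calculation on an invented `herd` data frame. The conversion factors (cattle 0.7, goats and sheep 0.1, pigs 0.2, poultry 0.01) are commonly used values and are an assumption here; the notebook reads a precomputed `TLU` column, so the factors actually applied to the survey data may differ.

```r
# Assumed conversion factors (commonly used values; not confirmed by the source)
tlu_factor <- c(cattle = 0.7, goats = 0.1, sheep = 0.1, pigs = 0.2, poultry = 0.01)

# Invented herd composition for three households, for illustration only
herd <- data.frame(
  cattle = c(0, 2, 120),
  goats = c(3, 0, 20),
  sheep = c(0, 0, 10),
  pigs = c(0, 1, 0),
  poultry = c(10, 5, 30)
)

# TLU per household: animal counts weighted by the conversion factors
herd$TLU <- as.vector(as.matrix(herd[, names(tlu_factor)]) %*% tlu_factor)
herd
```

With these factors, a household keeping only poultry and a few goats stays well below 1 TLU, while a herd of more than 100 cattle dominates a district average.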
#Tropical livestock unit
tlu <- list.data$livestockfinal.dta
tlu <- tlu[,c(8,11)]
tlu <- tlu[which(tlu$TLU > 0),]
sum.tlu <- tlu%>%
group_by(district)%>%
dplyr::summarise(n=n(),
mean=mean(TLU),
sd = sd(TLU),
Max = max(TLU),
Min = min(TLU))
sum.tlu <- as.data.frame(sum.tlu)
DT::datatable(round_df(sum.tlu, 3),caption = htmltools::tags$caption(
style = 'caption-side: top; text-align: center;',
'Table 3: ', htmltools::em('Distribution of livestock ownership by district.')), options = list(dom = 't'))
3.4 Crop Production
3.4.1 Crop Choice and Land use
Paddy rice is the dominant crop cultivated in the area, which is not surprising given that the area floods during the rainy season. As figure 14 shows, farmers on average allocate 80 percent of their land to rice and 13 percent to maize. Some farmers also produce vegetables, cassava, and other permanent crops and fruits. One of the striking characteristics of crop choice in the valley is the remarkable uniformity of farm units, which produce the same crop for the same reason year after year. Temporary mono-crop production predominates, with paddy rice and maize being the most important enterprises. Farmers base their decisions primarily on land suitability, productivity and prices. The majority of farmers claim that their plot is either not suitable for other crops or is more productive under the crop they grow.
df<-list.data$landshare.dta
df2 <- reshape2::melt(df)
df3 <- df2%>%
group_by(variable)%>%
dplyr::summarise(Average = mean(value))
plot_ly(df3, x = ~variable, y = ~Average, type = 'bar', text = ~round(Average,2), textposition = 'outside',textfont = list(color = 'black', size=12), width = 0.5, marker = list(color = myColors2)) %>%
layout(title = "Household Land Use Share",
xaxis = list(title = "Crop Type"),
yaxis = list(title = "Average Percentage Share"))
df <- list.data$Riceandmaize.dta
df <- df[,c(1,2)]
names(df)[1] <- "Maize Yield"
names(df)[2] <- "Rice Yield"
df2 <- reshape2::melt(df)
# geom is an argument of stat_ecdf(), not an aesthetic
gg<-ggplot(df2,aes(x = value)) + stat_ecdf(aes(colour = variable), geom = "step")+
geom_vline(xintercept = c(879,1040), color= c("red", "mediumturquoise"), linetype="dashed", size=0.7)+
guides(fill=guide_legend(title = NULL))+
labs(x="Yield in Kg/ha", y="ecd", title= "Empirical cumulative distribution of yield for Rice and Maize")+theme_light()+scale_fill_economist()+
theme(legend.title=element_blank(),plot.title = element_text(hjust = 0.5, size = 12, family = "Arial"))
gg
3.4.2 Input Use
# Input expenditure share by category
input <- list.data$InputExpenditureShare.dta
input1 <- input[,c(3,4,5,7)]
input_1 <- reshape2::melt(input1)
df <- input_1%>%
group_by(variable)%>%
dplyr::summarise(Avarage=mean(value, na.rm = T))
plot_ly(df, labels = ~variable, values = ~Avarage, type = 'pie',
textposition = 'inside',
textinfo = 'label+percent',
insidetextfont = list(color = '#FFFFFF'),
hoverinfo = 'text',
marker = list(colors = mycolor3,
line = list(color = '#FFFFFF', width = 1)),
#The 'pull' attribute can also be used to create space between the sectors
showlegend = FALSE) %>%
layout(title = 'Annual Input Expenditure by Category',
xaxis = list(showgrid = FALSE, zeroline = FALSE, showticklabels = FALSE),
yaxis = list(showgrid = FALSE, zeroline = FALSE, showticklabels = FALSE))
3.4.3 Crop Production Challenges
Although wetlands are fertile and provide consistent moisture for agricultural production, farmers face a number of challenges in wetland farming. From the farmers' point of view, diseases are the main constraint on wetland use in the valley: malaria and other water-borne diseases are common in the Kilombero valley. Farmers also reported agronomic constraints (pests, weeds and excessive floods) as the main challenges for crop production in the valley.
df <- list.data$TextQuestion.dta
df <- df[,c(14,15,16)]
df2 <- gather(df)
df3 <- df2%>%
group_by(key, value)%>%
dplyr::summarise(count = n())%>%
mutate(Percent=count/sum(count))
df3 <- na.omit(df3)
df3$key <- as.factor(df3$key)
levels(df3$key) <- c("First", "Second", "Third")
plot_ly(df3,x=~key, y = ~Percent, color = ~value) %>%
layout(title = "Main crop production challenges",
xaxis = list(title = "Challenges"),
yaxis = list(title = "Percent of Households"))
3.5 Market Participation
Farmers market different proportions of their crops for cash. Rice is usually prioritized both for local consumption and for its income-generating potential: 80 percent of surveyed households sold rice, compared to only 28 percent that reported selling maize. The survey results show that, on average, 60 percent of the rice and maize cultivated is sold for cash and the remaining 40 percent is retained for home consumption. The farmer commercialization index, a composite index of a farmer's total crop sales relative to total crop production, is 46 percent in the study area. The marketing channel is characterized by a large number of small traders operating between the farmer and the rice mills or maize market located in Ifakara. Local traders buy small quantities directly from farmers and transport them to mills, where the paddy is milled and the rice is sold to inter-regional traders, local retailers or directly to consumers.
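To make the commercialization index concrete, here is a minimal sketch that computes it per household as the value of crops sold over the value of crops produced. The value weighting by price, the column names and the numbers are all assumptions for illustration; the notebook only describes the index as the ratio of total crop sales to total crop production.

```r
# Hypothetical household records (kg harvested and kg sold); not survey data
crops <- data.frame(
  household = c("A", "A", "B", "B"),
  crop      = c("rice", "maize", "rice", "maize"),
  harvested = c(2000, 500, 1000, 400),
  sold      = c(1500, 0, 300, 100),
  price     = c(600, 400, 600, 400)   # assumed price per kg
)

# Value of production and of sales for each crop record
crops$value_produced <- crops$harvested * crops$price
crops$value_sold     <- crops$sold * crops$price

# Household index: total value sold over total value produced, in percent
totals <- aggregate(cbind(value_sold, value_produced) ~ household,
                    data = crops, FUN = sum)
totals$index <- 100 * totals$value_sold / totals$value_produced

totals
```

An index of 0 would describe a purely subsistence household and 100 a household that sells everything it produces.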
df <- list.data$combinedNew.dta
df<-df[,c(10,18)]
# refer to columns by name inside aes() rather than with df$
c <- ggplot(df, aes(FarmSize_Ha, HouseholdCommercializationIndex))+ stat_smooth()+labs(y="Commercialization Index", x="Farm size (ha)") +
theme_light()
ggplotly(c)%>%
layout(title=" Locally Weighted Regression Between Farm Size and Commercialization Index", font=list(
family = "Arial",
size = 11,
color = '#0d8458'))
3.7 Access to Infrastructure
The survey results indicate that, among the services evaluated, a tarmac road was located farther from most household dwellings than any other service, at an average distance of 28.43 kilometers. The other services and their average distances in kilometers from the dwellings were telephone center (13.43), health center (4), and river or stream (2.6).
3.8 Shocks and Responses
Across the valley, farmers face frequent risks and shocks to their agricultural production and livelihoods, including crop pest and disease damage, volatility in market prices and extreme weather events. Farming in the Kilombero valley is often subject to environmental disturbances such as drought, waterlogging, floods, untimely or uneven distribution of rainfall, and incidence of pests and diseases. Based on self-reported challenges, crop pests and diseases affected 74 percent of all farmers surveyed in the past 5 years. Drought and flooding affected 56 percent and 40 percent of the households, respectively. Other risks to farmer livelihoods were related to market prices: large increases in the prices of agricultural inputs (31 percent), low prices for their products (58 percent) and increases in food prices (38 percent). The survey results also show that farmers have limited strategies for coping with these shocks. Almost 50 percent of the households reported that they did nothing to cope with the shock, 18 percent reported that they worked more to recover, 11 percent used their own savings, and 10 percent got help from relatives and friends.
Shock Types
df <- list.data$Shock.dta
df2 <- df["shocktype"]
df3 <- df2%>%
group_by(shocktype)%>%
dplyr::summarise(count = n())%>%
mutate(Percent=(count/304)*100)
df3$shocktype <- factor(df3$shocktype, levels = df3$shocktype[order(df3$Percent, decreasing = TRUE)])
df3 <- round_df(df3, 3)
levels(df3$shocktype) <- c("Crop Pests<br>diseases","Change in <br> crop price","Floods","Droughts","Increase in<br> food prices", "input prices<br> Increase","Death of <br> hh member", "Crime","Input <br>Shortage", "Unsuccessful<br> investment","Serious<br> illness ","Loss of<br> land","Divorce","Loss of<br> job ","Other")
plot_ly(df3, x = ~shocktype, y = ~Percent, type = 'bar', text = ~round(Percent,2), textposition = 'outside',textfont = list(color = 'black', size=12),color =~shocktype, colors = "Set3", showlegend=F) %>%
layout(title = "Shocks Experienced",
xaxis = list(title = "", showticklabels = T,showline = T, zeroline = FALSE, showgrid = FALSE), yaxis = list(title = "Percent of Households"))
Coping Mechanisms
df <- list.data$Shock.dta$cope1
df <- as.data.frame(df)
names(df)<- "Cope"
df2 <- df%>%
group_by(Cope)%>%
dplyr::summarise(count=n())%>%
mutate(percent=(count/sum(count))*100)
# round_df() has no default for digits, so pass it explicitly
df3 <- remove_missing(round_df(df2, digits = 0))
df3 <- df3[which(df3$percent > 2),]
df3$Cope <- factor(df3$Cope)
levels(df3$Cope) <- c("Nothing", "Reduced<br>consumption","Got assistance <br> from relatives <br> or friends ", "Borrowed <br> from <br>others","Work more","Use savings" )
plot_ly(df3, x = ~Cope, y = ~percent, type = 'bar', text = ~percent, textposition = 'outside',textfont = list(color = 'black', size=12),color =~Cope, colors = "Paired" , showlegend=FALSE)%>%
layout(title="Coping Mechanism", xaxis=list(showticklabels = TRUE) )
original.data <- list.data$Perception2.dta
original.data <- remove_missing(original.data)
myfactor <- function(x) {
factor(x, labels = c("Strongly Agree", "Agree", "Neutral", "Disagree", "Strongly Disagree") )
}
original.data2 <- lapply(original.data, myfactor)
original.data2 <- as.data.frame(original.data2)
names(original.data2) <- c(
"Compared to the past, <br> the use of the wetlands for crop production <br> in this area has increased",
"Compared to the past, the use of wetlands <br> for material collection has increased ",
"Compared to the past, the level of fertility <br> in the wetland has declined ",
"People in this community would support <br> efforts to conserve the wetland(s)",
"Compared to 10 years ago, the amount of water <br> in the wetlands in this area has declined ",
"Compared to the last 10 years,<br> fishing activities in the wetland have increased",
"People in this area feel that they own the wetland ",
"People in this area care about <br>public natural resources ",
"I have witnessed some form of conflict<br> over wetland resources in the past",
"Government officials are effective<br> in protecting the wetland",
"It is safe to leave my home unattended<br> because no one can steal anything",
"People in this area feel generally secure<br> when dealing with outsiders ",
"If I drop my wallet or purse somewhere within<br> this village, I am likely to get it back",
"People in this area have <br>strong traditional attachment to wetlands",
"Most of my neighbours have lost some assets <br> or livestock to thieves within the last five years",
"Farm sizes in my village have <br>declined compared to the past",
"The population in my village has increased<br> currently compared to 10 years ago",
"In the past 5 years,<br> I have noticed an increase in <br>the amount of income I generate from the Wetland",
"My family’s economic situation has <br>improved compared with 5 years ago"
)
original.data3 <- original.data2
df_summary <- likert(original.data3)
df_summary <- as.data.frame(df_summary$results)
df_summary_1 <- df_summary[c(1:6),]
df_summary_2 <- df_summary[c(7:12),]
df_summary_3 <- df_summary[c(13:19),]
# write.csv(df_summary_1, here::here("Data_csv","Perception1.csv"))
# write.csv(df_summary_2, here::here("Data_csv","Perception2.csv"))
# write.csv(df_summary_3, here::here("Data_csv","Perception3.csv"))
3.9 Perception of Wetlands
Sustainable use of a common-pool resource is related to the positive and negative opinions and perceptions that farmers hold regarding its benefits, the underlying pressures on it and the social structure of its users, to mention a few. To understand the opinions and perceptions of farmers in the KVFP towards the wetland ecosystem, we documented farmers' responses to a set of 19 statements. The statements use a typical five-point Likert scale [strongly agree, agree, neutral, disagree, and strongly disagree], which is used to tap into the cognitive and affective components of attitudes. The following charts show the percentage of farmers expressing agreement or disagreement with each statement on a symmetric agree-disagree scale. The second chart presents the distribution of the responses for each statement. For example, 81 percent of the farmers agree that people in the community would support efforts to conserve wetlands in the KVFP. 80 percent of the farmers also agree that the use of wetlands for crop production has increased over the years. Although almost all surveyed farmers feel they own the wetland, only 25 percent agree that they have a strong traditional attachment to wetlands. 98 percent of the farmers are of the opinion that the population in their village has increased over time, and 58 percent that farm sizes have declined compared to the past.
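The percentages reported above are shares of respondents per answer level. The base R sketch below shows one way such shares can be computed, along with the horizontal midpoints at which the stacked bar charts later in this section place their labels. The responses are made up, and the notebook itself uses the likert package rather than this hand-rolled version.

```r
# Hypothetical responses to two statements (not survey data)
scale_levels <- c("Strongly Agree", "Agree", "Neutral",
                  "Disagree", "Strongly Disagree")
responses <- data.frame(
  q1 = factor(c("Agree", "Agree", "Strongly Agree", "Neutral", "Disagree"),
              levels = scale_levels),
  q2 = factor(c("Disagree", "Agree", "Strongly Disagree", "Agree", "Agree"),
              levels = scale_levels)
)

# Percentage of respondents choosing each level, one row per statement
shares <- t(sapply(responses, function(x) 100 * prop.table(table(x))))

# Horizontal midpoint of each stacked-bar segment: the cumulative share
# up to and including the segment, minus half of the segment's own width
midpoints <- t(apply(shares, 1, function(p) cumsum(p) - p / 2))

shares
midpoints
```

The midpoint rule is the same arithmetic as the `x1 / 2`, `x1 + x2 / 2`, and so on used for the annotations in the plotting code.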
# density plot of the responses
data.orginal <- original.data2
data.orginal2 <- data.orginal[,c(1:6)]
data.orginal3 <- data.orginal[,c(7:12)]
data.orginal4 <- data.orginal[,c(13:19)]
Result <- likert(data.orginal)
Result2 <-likert(data.orginal2)
Result3 <-likert(data.orginal3)
Result4 <- likert(data.orginal4)
gg2<-plot(Result2,
type="density",
facet = TRUE,
bw = 0.5, mar = c(1, 1, 2, 1))
gg3<-plot(Result3,
type="density",
facet = TRUE,
bw = 0.5, mar = c(1, 1, 2, 1))
gg4<-plot(Result4,
type="density",
facet = TRUE,
bw = 0.5)
Perception One
options(stringsAsFactors = FALSE)
data <- read.csv(here::here("Data_csv","Perception1.csv"))
data <- data[,c(2:7)]
# round_df() has no default for digits, so pass it explicitly
data <- round_df(data, digits = 0)
names(data) <- c("y", "x1","x2","x3","x4", "x5")
y= data$y
x1 <- data$x1
x2 <- data$x2
x3 <- data$x3
x4 <- data$x4
x5 <- data$x5
top_labels <- c('Strongly<br>agree', 'Agree', 'Neutral', 'Disagree', 'Strongly<br>disagree')
p <- plot_ly(data, x = ~x1, y = ~y, type = 'bar', orientation = 'h', name="Strongly<br>agree",
marker = list(color = 'rgba(38, 24, 74, 0.8)',
line = list(color = 'rgb(248, 248, 249)', width = 1))) %>%
add_trace(x = ~x2, name="Agree", marker = list(color = 'rgba(71, 58, 131, 0.8)')) %>%
add_trace(x = ~x3,name="Neutral", marker = list(color = 'rgba(122, 120, 168, 0.8)')) %>%
add_trace(x = ~x4, name="Disagree", marker = list(color = 'rgba(164, 163, 204, 0.85)')) %>%
add_trace(x = ~x5, name="Strongly Disagree", marker = list(color = 'rgba(190, 192, 213, 1)')) %>%
layout(xaxis = list(title = "",
showgrid = FALSE,
showline = FALSE,
showticklabels = FALSE,
zeroline = FALSE,
domain = c(0.15, 1)),
yaxis = list(title = "",
showgrid = FALSE,
showline = FALSE,
showticklabels = FALSE,
zeroline = FALSE),
barmode = 'stack',
paper_bgcolor = 'rgb(248, 248, 255)', plot_bgcolor = 'rgb(248, 248, 255)',
margin = list(l = 150, r = 5, t = 140, b = 60),
showlegend = FALSE) %>%
# labeling the y-axis
add_annotations(xref = 'paper', yref = 'y', x = 0.14, y = y,
xanchor = 'right',
text = y,
font = list(family = 'Arial', size = 10,
color = 'rgb(67, 67, 67)'),
showarrow = FALSE, align = 'right') %>%
# labeling the percentages of each bar (x_axis)
add_annotations(xref = 'x', yref = 'y',
x = x1 / 2, y = y,
text = paste(data[,"x1"], '%'),
font = list(family = 'Arial', size = 10,
color = 'rgb(248, 248, 255)'),
showarrow = FALSE) %>%
add_annotations(xref = 'x', yref = 'y',
x = x1 + x2 / 2, y = y,
text = paste(data[,"x2"], '%'),
font = list(family = 'Arial', size = 10,
color = 'rgb(248, 248, 255)'),
showarrow = FALSE) %>%
add_annotations(xref = 'x', yref = 'y',
x = x1 + x2 + x3 / 2, y = y,
text = paste(data[,"x3"], '%'),
font = list(family = 'Arial', size = 10,
color = 'rgb(248, 248, 255)'),
showarrow = FALSE) %>%
add_annotations(xref = 'x', yref = 'y',
x = x1 + x2 + x3 + x4 / 2, y = y,
text = paste(data[,"x4"], '%'),
font = list(family = 'Arial', size = 10,
color = 'rgb(248, 248, 255)'),
showarrow = FALSE) %>%
add_annotations(xref = 'x', yref = 'y',
x = x1 + x2 + x3 + x4 + x5 / 2, y = y,
text = paste(data[,"x5"], '%'),
font = list(family = 'Arial', size = 10,
color = 'rgb(248, 248, 255)'),
showarrow = FALSE) %>%
# labeling the first Likert scale (on the top)
add_annotations(xref = 'x', yref = 'paper',
x = c(21 / 2, 21 + 30 / 2, 21 + 30 + 21 / 2, 21 + 30 + 21 + 16 / 2,
21 + 30 + 21 + 16 + 12 / 2),
y = 1.15,
text = top_labels,
font = list(family = 'Arial', size = 10,
color = 'rgb(67, 67, 67)'),
showarrow = FALSE)
p
Perception Two
options(stringsAsFactors = FALSE)
data <- read.csv(here::here("Data_csv","Perception2.csv"))
data <- data[,c(2:7)]
# round_df() has no default for digits, so pass it explicitly
data <- round_df(data, digits = 0)
names(data) <- c("y", "x1","x2","x3","x4", "x5")
y= data$y
x1 <- data$x1
x2 <- data$x2
x3 <- data$x3
x4 <- data$x4
x5 <- data$x5
top_labels <- c('Strongly<br>agree', 'Agree', 'Neutral', 'Disagree', 'Strongly<br>disagree')
p2 <- plot_ly(data, x = ~x1, y = ~y, type = 'bar', orientation = 'h', name="Strongly<br>agree",
marker = list(color = 'rgba(38, 24, 74, 0.8)',
line = list(color = 'rgb(248, 248, 249)', width = 1))) %>%
add_trace(x = ~x2, name="Agree", marker = list(color = 'rgba(71, 58, 131, 0.8)')) %>%
add_trace(x = ~x3,name="Neutral", marker = list(color = 'rgba(122, 120, 168, 0.8)')) %>%
add_trace(x = ~x4, name="Disagree", marker = list(color = 'rgba(164, 163, 204, 0.85)')) %>%
add_trace(x = ~x5, name="Strongly Disagree", marker = list(color = 'rgba(190, 192, 213, 1)')) %>%
layout(xaxis = list(title = "",
showgrid = FALSE,
showline = FALSE,
showticklabels = FALSE,
zeroline = FALSE,
domain = c(0.15, 1)),
yaxis = list(title = "",
showgrid = FALSE,
showline = FALSE,
showticklabels = FALSE,
zeroline = FALSE),
barmode = 'stack',
paper_bgcolor = 'rgb(248, 248, 255)', plot_bgcolor = 'rgb(248, 248, 255)',
margin = list(l = 150, r = 5, t = 140, b = 60),
showlegend = FALSE) %>%
# labeling the y-axis
add_annotations(xref = 'paper', yref = 'y', x = 0.14, y = y,
xanchor = 'right',
text = y,
font = list(family = 'Arial', size = 10,
color = 'rgb(67, 67, 67)'),
showarrow = FALSE, align = 'right') %>%
# labeling the percentages of each bar (x_axis)
add_annotations(xref = 'x', yref = 'y',
x = x1 / 2, y = y,
text = paste(data[,"x1"], '%'),
font = list(family = 'Arial', size = 10,
color = 'rgb(248, 248, 255)'),
showarrow = FALSE) %>%
add_annotations(xref = 'x', yref = 'y',
x = x1 + x2 / 2, y = y,
text = paste(data[,"x2"], '%'),
font = list(family = 'Arial', size = 10,
color = 'rgb(248, 248, 255)'),
showarrow = FALSE) %>%
add_annotations(xref = 'x', yref = 'y',
x = x1 + x2 + x3 / 2, y = y,
text = paste(data[,"x3"], '%'),
font = list(family = 'Arial', size = 10,
color = 'rgb(248, 248, 255)'),
showarrow = FALSE) %>%
add_annotations(xref = 'x', yref = 'y',
x = x1 + x2 + x3 + x4 / 2, y = y,
text = paste(data[,"x4"], '%'),
font = list(family = 'Arial', size = 10,
color = 'rgb(248, 248, 255)'),
showarrow = FALSE) %>%
add_annotations(xref = 'x', yref = 'y',
x = x1 + x2 + x3 + x4 + x5 / 2, y = y,
text = paste(data[,"x5"], '%'),
font = list(family = 'Arial', size = 10,
color = 'rgb(248, 248, 255)'),
showarrow = FALSE) %>%
# labeling the first Likert scale (on the top)
add_annotations(xref = 'x', yref = 'paper',
x = c(21 / 2, 21 + 30 / 2, 21 + 30 + 21 / 2, 21 + 30 + 21 + 16 / 2,
21 + 30 + 21 + 16 + 12 / 2),
y = 1.15,
text = top_labels,
font = list(family = 'Arial', size = 10,
color = 'rgb(67, 67, 67)'),
showarrow = FALSE)
p2
Perception Three
options(stringsAsFactors = FALSE)
data <- read.csv(here::here("Data_csv","Perception3.csv"))
data <- data[,c(2:7)]
# round_df() has no default for digits, so pass it explicitly
data <- round_df(data, digits = 0)
names(data) <- c("y", "x1","x2","x3","x4", "x5")
y= data$y
x1 <- data$x1
x2 <- data$x2
x3 <- data$x3
x4 <- data$x4
x5 <- data$x5
top_labels <- c('Strongly<br>agree', 'Agree', 'Neutral', 'Disagree', 'Strongly<br>disagree')
p3 <- plot_ly(data, x = ~x1, y = ~y, type = 'bar', orientation = 'h', name="Strongly<br>agree",
marker = list(color = 'rgba(38, 24, 74, 0.8)',
line = list(color = 'rgb(248, 248, 249)', width = 1))) %>%
add_trace(x = ~x2, name="Agree", marker = list(color = 'rgba(71, 58, 131, 0.8)')) %>%
add_trace(x = ~x3,name="Neutral", marker = list(color = 'rgba(122, 120, 168, 0.8)')) %>%
add_trace(x = ~x4, name="Disagree", marker = list(color = 'rgba(164, 163, 204, 0.85)')) %>%
add_trace(x = ~x5, name="Strongly Disagree", marker = list(color = 'rgba(190, 192, 213, 1)')) %>%
layout(xaxis = list(title = "",
showgrid = FALSE,
showline = FALSE,
showticklabels = FALSE,
zeroline = FALSE,
domain = c(0.15, 1)),
yaxis = list(title = "",
showgrid = FALSE,
showline = FALSE,
showticklabels = FALSE,
zeroline = FALSE),
barmode = 'stack',
paper_bgcolor = 'rgb(248, 248, 255)', plot_bgcolor = 'rgb(248, 248, 255)',
margin = list(l = 150, r = 5, t = 140, b = 60),
showlegend = FALSE) %>%
# labeling the y-axis
add_annotations(xref = 'paper', yref = 'y', x = 0.14, y = y,
xanchor = 'right',
text = y,
font = list(family = 'Arial', size = 10,
color = 'rgb(67, 67, 67)'),
showarrow = FALSE, align = 'right') %>%
# labeling the percentages of each bar (x_axis)
add_annotations(xref = 'x', yref = 'y',
x = x1 / 2, y = y,
text = paste(data[,"x1"], '%'),
font = list(family = 'Arial', size = 10,
color = 'rgb(248, 248, 255)'),
showarrow = FALSE) %>%
add_annotations(xref = 'x', yref = 'y',
x = x1 + x2 / 2, y = y,
text = paste(data[,"x2"], '%'),
font = list(family = 'Arial', size = 10,
color = 'rgb(248, 248, 255)'),
showarrow = FALSE) %>%
add_annotations(xref = 'x', yref = 'y',
x = x1 + x2 + x3 / 2, y = y,
text = paste(data[,"x3"], '%'),
font = list(family = 'Arial', size = 10,
color = 'rgb(248, 248, 255)'),
showarrow = FALSE) %>%
add_annotations(xref = 'x', yref = 'y',
x = x1 + x2 + x3 + x4 / 2, y = y,
text = paste(data[,"x4"], '%'),
font = list(family = 'Arial', size = 10,
color = 'rgb(248, 248, 255)'),
showarrow = FALSE) %>%
add_annotations(xref = 'x', yref = 'y',
x = x1 + x2 + x3 + x4 + x5 / 2, y = y,
text = paste(data[,"x5"], '%'),
font = list(family = 'Arial', size = 10,
color = 'rgb(248, 248, 255)'),
showarrow = FALSE) %>%
# labeling the first Likert scale (on the top)
add_annotations(xref = 'x', yref = 'paper',
x = c(21 / 2, 21 + 30 / 2, 21 + 30 + 21 / 2, 21 + 30 + 21 + 16 / 2,
21 + 30 + 21 + 16 + 12 / 2),
y = 1.15,
text = top_labels,
font = list(family = 'Arial', size = 10,
color = 'rgb(67, 67, 67)'),
showarrow = FALSE)
p3
4 Characterizing farmers in KVFP through Typology
To capture farmer heterogeneity and the diversity of livelihoods and strategies, we created an attribute-based typology using a non-parametric multivariate statistical technique. Farmer typology research has become popular as a way of segmenting farmers into groups to assist in developing targeted programs. In this approach, survey data are collected from farmers and then clustered statistically, from the data upwards, to develop groupings. The emerging types are grounded in the data rather than obtained by classifying cases (farmers) into predetermined classes, as in expert-based farmer typologies. Once farm types are identified, farmers can be discriminated by the characteristics of their households and of their farm management. In particular, types of farmers can be distinguished on the basis of their land use, income, involvement in on-farm and off-farm activities, market participation and access to infrastructure. Different types of farmers are expected to pursue different land use trajectories, with important effects on the ecosystem services that flow from the KVFP. These differences can result in heterogeneous uptake of alternative farming practices and future technologies, and can be used to target interventions more effectively. The methodology in this study involves two steps: Principal Component Analysis, to reduce the dimensionality of the variables under consideration, and Hierarchical Clustering, to group farmers into segments.
Principal component analysis is a multivariate statistical technique that linearly transforms an original set of variables into a smaller set of uncorrelated variables [called principal components] that account for decreasing proportions of the total variance of the original variables (Dunteman 1989). This phase can be considered a denoising step, which can lead to a more stable clustering. According to Husson, Lê, and Pagès (2017), PCA can be used to separate signal from noise in the original data set: the first components extract the essential information, whilst the last components represent the noise. Applying the clustering to the principal components, without the noise in the data, therefore leads to more stable clusters.
Hierarchical Clustering. Clustering is a multivariate statistical procedure that starts with a data set containing information about a sample of cases and attempts to reorganize these cases into relatively homogeneous groups based on their Euclidean distance from the cluster centers. As a technique, it is used to explore data sets and assess whether they can be summarized meaningfully in terms of a relatively small number of groups or clusters of cases that resemble each other and differ in some respects from cases in other clusters (Everitt et al. 2011). There are different approaches to clustering: partitioning, hierarchical, density-based, grid-based and model-based clustering. In this study, we used hierarchical clustering, an alternative to partitioning methods such as k-means for identifying groups in a data set. It does not require the number of clusters to be specified in advance.
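The point about not fixing the number of clusters in advance can be illustrated with base R. The sketch below uses `hclust()` on the built-in `USArrests` data as a stand-in for the household variables (an assumption for illustration; the notebook itself clusters the survey data with the HCPC function further down).

```r
# Built-in data set used as a stand-in for the household variables
x <- scale(USArrests)

# Pairwise Euclidean distances between observations
d <- dist(x, method = "euclidean")

# Agglomerative clustering; no number of clusters is supplied here
tree <- hclust(d, method = "ward.D2")

# The tree can be cut afterwards at any chosen number of groups
groups3 <- cutree(tree, k = 3)
groups4 <- cutree(tree, k = 4)

table(groups3)
table(groups4)
```

Because both partitions come from the same tree, the four-group solution is a refinement of the three-group one, which is not guaranteed when k-means is rerun with a different k.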
A total of 14 variables were selected based on farmers' livelihood and land use choices. The selected variables include the age of the household head, household size, access to market, commercialization index, land allocation to crops (rice, maize and vegetables), livestock ownership, access to water, and access to off-farm and on-farm income, among others. To reduce the effect of differences in units of measurement, the variables were scaled.
In the following sections, we describe the procedure (and R code) followed in grouping the farmers. The clustering was implemented using the R statistical software.
The main references for the statistical methods and software implementation are listed at the end of the notebook.
4.1 Principal Component Analysis (PCA)
As stated in the previous section, Principal component analysis is used to extract the important information from a multivariate data table and to express this information as a set of new variables called principal components.
Construction of the PCA involves a number of steps. The first step is to center and scale the data with the scale() function (Everitt et al. 2011). Scaling the data values involves:
1. Center: subtract from each value the mean of the corresponding vector.
2. Scale: divide each centered vector by its root mean square (rms): \[ x_{rms} = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}{x_{i}^{2}}} \]
Result: mean = 0 and standard deviation = 1.
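The two steps can be checked by hand against `scale()`. The sketch below does this on a small made-up table; the column names and values are hypothetical and are not taken from the survey.

```r
# Small made-up numeric table (not survey data)
x <- data.frame(farm_size = c(0.5, 1.2, 3.0, 8.0),
                hh_size   = c(3, 5, 6, 9))

# Step 1: centre each column on its mean
centered <- sweep(as.matrix(x), 2, colMeans(x), "-")

# Step 2: divide each centred column by its root mean square,
# using n - 1 in the denominator
rms <- sqrt(colSums(centered^2) / (nrow(x) - 1))
manual <- sweep(centered, 2, rms, "/")

# scale() performs both steps in one call
auto <- scale(x)

colMeans(manual)      # approximately 0 for every column
apply(manual, 2, sd)  # 1 for every column
```

For a centered column the root mean square with n - 1 in the denominator is the sample standard deviation, which is why the result has unit standard deviation.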
Once the data are scaled, we run the PCA function from the FactoMineR package (Kassambara and Mundt 2017) and extract the eigenvalues, the loadings and the contribution of each variable to the components. Table 6 shows the eigenvalues of the top 6 components. The eigenvalues represent the amount of variation retained by each PC. The first PC corresponds to the direction with the maximum amount of variation in the data set. Together, the first five components represent 68 percent of the variation in the original data set.
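For readers without the survey data, the relationship between eigenvalues and retained variation can be reproduced with base R. This sketch swaps in `prcomp()` and the built-in `USArrests` data (both assumptions for illustration; the notebook uses `PCA()` on `data1`).

```r
# PCA on a built-in data set, with centring and scaling switched on
pca <- prcomp(USArrests, center = TRUE, scale. = TRUE)

# Eigenvalues are the squared standard deviations of the components
eig <- pca$sdev^2

eig_table <- data.frame(
  eigenvalue         = eig,
  percent_variance   = 100 * eig / sum(eig),
  cumulative_percent = 100 * cumsum(eig) / sum(eig)
)

eig_table
```

With standardized variables the eigenvalues add up to the number of variables, so each percentage is simply the eigenvalue divided by that count. The cumulative column is the one to read when deciding how many components to keep.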
# run the PCA function from the FactoMineR package on the original data
pca1 <- PCA(data1, graph = FALSE)
# extract the eigenvalues
eigenvalues <- pca1$eig
DT::datatable(round_df(head(eigenvalues[, 1:2]),4), class = 'table-bordered', options = list(searching = FALSE,pageLength = 6,lengthMenu = c(5, 10, 15, 20), scrollX = T),caption = htmltools::tags$caption(
style = 'caption-side: top; text-align: center;',
'Table 6: ', htmltools::em('The amount of variation retained by each PC [eigenvalues] .')
))
g <- fviz_pca_var(pca1, col.var="cos2", title = "") +
scale_color_gradient2(low="white", mid="green", high="red", midpoint=0.5) + theme_minimal()
ggplotly(g)
The following table and figures show the contributions of variables in accounting for the variability in a given principal component. For example, farm size in hectares accounts for 34 percent of the first component, followed by tropical livestock units (TLU), accounting for 27 percent of the variability. Because the components are orthogonal, a different set of variables explains the variability of the second component, with the land shares of rice and maize accounting for 70 percent of the variability.
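A variable's contribution to a component can be sketched in base R as its squared loading expressed in percent. The example again uses `prcomp()` on `USArrests` as a stand-in, and it assumes the usual definition of contribution; it is not the notebook's own computation, which reads `pca1$var$contrib`.

```r
pca <- prcomp(USArrests, center = TRUE, scale. = TRUE)

# Contribution of each variable to each component, in percent:
# the squared loading, since each loading vector has unit length
contrib <- 100 * pca$rotation^2

round(contrib, 1)
colSums(contrib)  # the contributions to each component add up to 100
```

Reading down a column shows which variables dominate that component, which is how statements such as "farm size accounts for 34 percent of the first component" are obtained.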
DT::datatable(round_df(head(pca1$var$contrib),4), class = 'table-bordered', options = list(searching = FALSE,pageLength = 5,lengthMenu = c(5, 10, 15, 20), scrollX = T),caption = htmltools::tags$caption(
style = 'caption-side: top; text-align: center;',
'Table 7: ', htmltools::em('The contributions of variables in accounting for the variability in a given principal component[%].')
) )
# fviz_pca_contrib(pca1, choice = "var", axes = 2)
# fviz_pca_contrib(pca1, choice = "var", axes = 3)
# fviz_pca_contrib(pca1, choice = "var", axes = 4)
# fviz_pca_contrib(pca1, choice = "var", axes = 5)
# #Total contribution on PC1 and PC2
# fviz_pca_contrib(pca1, choice = "var", axes = 1:5)
res.desc <- dimdesc(pca1, axes = c(1,5))
res.desc2 <- dimdesc(pca1)
# res.desc2$Dim.1
# res.desc2$Dim.2
# res.desc2$Dim.3
# res.desc2$Dim.4
# res.desc2$Dim.5
[Tabbed figures: Dimension All, Dimension 1, Dimension 2, Dimension 3, Dimension 4]
4.2 Hierarchical clustering on Principal components (HCPC)
Hierarchical clustering algorithms output a dendrogram, a tree representation of the data whose leaves are the input patterns and whose non-leaf nodes represent a hierarchy of groupings (Husson et al., 2011). There are different types of hierarchical clustering (HC), with agglomerative and divisive being the most widely used. Agglomerative HC works bottom up: each individual starts in a separate cluster, and clusters are then iteratively merged according to some criterion. Divisive algorithms start from the whole data set in a single cluster and work top down, iteratively dividing each cluster into two components until all clusters are singletons. Here we used agglomerative HC with Ward’s criterion. This criterion decomposes the total inertia (total variance) into between-group and within-group variance, and at each step of the algorithm merges the pair of clusters for which the growth of within-inertia is minimal (in other words, for which the reduction of between-inertia is minimal). The total inertia can be decomposed as:
\[ \sum_{k=1}^K\sum_{q=1}^Q\sum_{i=1}^{I_{q}}{(x_{iqk}-{\overline{x}}_{k})}^{2} = \sum_{k=1}^K\sum_{q=1}^Q I_{q}{({\overline{x}}_{qk}-{\overline{x}}_{k})}^{2}+\sum_{k=1}^K\sum_{q=1}^Q\sum_{i=1}^{I_{q}}{(x_{iqk}-{\overline{x}}_{qk})}^{2} \]
where \(x_{iqk}\) is the value of variable k for individual i of cluster q, \({\overline{x}}_{qk}\) the mean of variable k for cluster q, \({\overline{x}}_{k}\) the overall mean of variable k and \(I_{q}\) the number of individuals in cluster q. The first term on the right-hand side is the between-inertia and the second is the within-inertia.
The hierarchy is represented by a dendrogram which is indexed by the gain of within-inertia.
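The decomposition of total inertia into between and within parts can be verified numerically. The sketch below does so with base R's `hclust()` using Ward's method on the built-in `USArrests` data, which stands in for the principal component scores (an assumption for illustration; the notebook's clustering is done by HCPC below).

```r
x <- scale(USArrests)
tree <- hclust(dist(x), method = "ward.D2")
cluster <- cutree(tree, k = 3)

# Total inertia: squared deviations from the overall mean of each variable
grand_mean <- colMeans(x)
total <- sum(sweep(x, 2, grand_mean)^2)

# Within and between inertia, accumulated cluster by cluster
within <- 0
between <- 0
for (q in unique(cluster)) {
  xq <- x[cluster == q, , drop = FALSE]
  centre_q <- colMeans(xq)
  within  <- within + sum(sweep(xq, 2, centre_q)^2)
  between <- between + nrow(xq) * sum((centre_q - grand_mean)^2)
}

c(total = total, between = between, within = within)
```

Whatever number of clusters is chosen, `between + within` equals `total`; cutting the tree lower only moves inertia from the within term to the between term.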
# Compute HCPC
res.hcpc2 <- HCPC(pca1, graph = FALSE)
# Plot the dendrogram only
f<-fviz_dend(res.hcpc2, cex = 0.5, show_labels = FALSE, k_colors = c("#26547C","#EF476F","#FFD166"), rect = TRUE, rect_fill = TRUE, rect_border = c("#26547C","#EF476F","#FFD166"),lower_rect = -0.1, ggtheme = theme_grey())
ggplotly(f)%>% layout(showlegend=F)
# Draw only the factor map
g <-fviz_cluster(res.hcpc2, geom = "point", title = "Factor Map", palette = "jco")+theme_minimal()
ggplotly(g)%>%
layout(showlegend=F)
myData2 <- res.hcpc2$data.clust
levels(myData2$clust) <- c("Mono-crop Rice Producers","Diversifier", "Agro-Pastoralist")
# Variable describing clusters
type <- as.data.frame(res.hcpc2$desc.var$quanti.var)
type1 <- res.hcpc2$desc.var$quanti$`1`
type2 <- res.hcpc2$desc.var$quanti$`2`
type3 <- res.hcpc2$desc.var$quanti$`3`
DT::datatable(round_df(type,5), class = 'table-bordered', options = list(searching = FALSE,pageLength = 10,lengthMenu = c(5, 10, 15, 20), scrollX = T), caption = htmltools::tags$caption(
style = 'caption-side: top; text-align: center;',
'Table 8: ', htmltools::em('Link between the cluster variable and the quantitative variables.')))
The share of land allocated to maize and vegetables, percent hired labor and the share of rice are most significantly associated with cluster two. Hence, we labeled the second cluster of farmers “Diversifier”, with their land allocated to maize, vegetables and rice (significantly lower than the overall average).
Finally, the third cluster is associated with farm size, TLU, household size and per capita income. Given this mix of farming and livestock keeping, we labeled this cluster of farmers “Agro-Pastoralists”. The agro-pastoralists have larger farms, more TLU, larger households and higher per capita income than their peers in the valley, along with lower (crop) market participation and fewer labor man-days per hectare per year.
DT::datatable(round_df(type1,5), class = 'table-bordered', options = list(searching = FALSE,pageLength = 10,lengthMenu = c(5, 10, 15, 20), scrollX = T), caption = htmltools::tags$caption(
style = 'caption-side: top; text-align: center;',
'Table 9: ', htmltools::em('Description of cluster one by quantitative variables')))
DT::datatable(round_df(type2,5), class = 'table-bordered', options = list(searching = FALSE,pageLength = 10,lengthMenu = c(5, 10, 15, 20), scrollX = T), caption = htmltools::tags$caption(
style = 'caption-side: top; text-align: center;',
'Table 10: ', htmltools::em('Description of cluster two by quantitative variables.')))DT::datatable(round_df(type3,5), class = 'table-bordered', options = list(searching = FALSE,pageLength = 10,lengthMenu = c(5, 10, 15, 20), scrollX = T), caption = htmltools::tags$caption(
style = 'caption-side: top; text-align: center;',
'Table 11: ', htmltools::em('Description of cluster three by quantitative variables.')))Based on the above labeling of the clusters, the following chart shows the proportion of each farm type in sampled farmers. The majority(65 percent) of the farmers are mono-crop rice producers , 28 percent of the farmers are diversifiers and the remaining 7 percent are agro-pastorals.
# as_data_frame() is deprecated; build the one-column data frame directly
df2 <- data.frame(value = myData2$clust)
df <- df2%>%
group_by(value)%>%
dplyr::summarise(counts=n())%>%
mutate(prop = round(counts*100/sum(counts), 1),
lab.ypos = cumsum(prop) - 0.5*prop)
plot_ly(df, labels = ~value, values = ~prop, type = 'pie',
textposition = 'inside',
textinfo = 'label+percent',
insidetextfont = list(color = '#FFFFFF'),
hoverinfo = 'text',
marker = list(colors=mycolor3,
line = list(color = '#FFFFFF', width = 1)),
#The 'pull' attribute can also be used to create space between the sectors
showlegend = FALSE) %>%
layout(title = 'Proportion of farmer Types',
xaxis = list(showgrid = FALSE, zeroline = FALSE, showticklabels = FALSE),
yaxis = list(showgrid = FALSE, zeroline = FALSE, showticklabels = FALSE))
The box plots below show the main differences between farm types in attributes that are important for understanding farm management and land use trajectories.
4.2.1 Box plot for the variables and farm type
df_box <- myData2
levels(df_box$clust) <- c("Mono-crop Rice Producers","Diversifier", "Agro-Pastoralist")
my_comparisons <-
list(
c("Mono-crop Rice Producers", "Agro-Pastoralist"),
c("Mono-crop Rice Producers", "Diversifier"),
c("Agro-Pastoralist", "Diversifier")
)
g <- ggplot(df_box, aes(clust, FarmSize_Ha))+ stat_boxplot(geom = "errorbar", width = 0.2, linetype=3) + geom_boxplot(aes(fill = clust),
width =
0.4, alpha = 0.5
) + stat_compare_means(
comparisons = my_comparisons,
method = "t.test",
label = "p.signif"
) + theme_fivethirtyeight( base_family = "Arial") + labs(
x = "",
title = "Farm Size by Farmer Type",
caption = " Pairwise mean comparisons [ns: p > 0.05] [*: p <= 0.05]
[**: p <= 0.01] [***: p <= 0.001][****: p <= 0.0001]"
) + theme(
legend.position =
"none",
plot.title = element_text(hjust = 0.5, size = 12)
)
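The significance symbols shown on the box plots can be reproduced by hand. The sketch below, on hypothetical toy data standing in for `df_box`, runs a Welch t-test for each pair in `my_comparisons` and maps the p-value to the symbols listed in the plot captions. The helper `p_to_symbol()` is illustrative and not part of any package. Note that, as far as we know, `stat_compare_means()` falls back to a Wilcoxon test when no `method` is given, so plots without `method = "t.test"` compare distributions by rank rather than by mean.

```r
# Hypothetical toy data standing in for df_box (values are simulated).
set.seed(42)
types <- c("Mono-crop Rice Producers", "Diversifier", "Agro-Pastoralist")
toy <- data.frame(
  clust       = factor(rep(types, times = c(30, 15, 8)), levels = types),
  FarmSize_Ha = c(rnorm(30, mean = 2,   sd = 0.6),
                  rnorm(15, mean = 2.2, sd = 0.6),
                  rnorm(8,  mean = 6,   sd = 1))
)

my_comparisons <- list(
  c("Mono-crop Rice Producers", "Agro-Pastoralist"),
  c("Mono-crop Rice Producers", "Diversifier"),
  c("Agro-Pastoralist", "Diversifier")
)

# Map a p-value to the symbols used in the captions:
# ns: p > 0.05, *: p <= 0.05, **: p <= 0.01, ***: p <= 0.001, ****: p <= 0.0001
p_to_symbol <- function(p) {
  as.character(cut(p,
                   breaks = c(-Inf, 1e-04, 0.001, 0.01, 0.05, Inf),
                   labels = c("****", "***", "**", "*", "ns"),
                   right  = TRUE))
}

# One Welch two-sample t-test per pair of farm types.
results <- do.call(rbind, lapply(my_comparisons, function(pair) {
  x <- toy$FarmSize_Ha[toy$clust == pair[1]]
  y <- toy$FarmSize_Ha[toy$clust == pair[2]]
  p <- t.test(x, y)$p.value
  data.frame(group1 = pair[1], group2 = pair[2], p = p, signif = p_to_symbol(p))
}))
print(results)
```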
gg <- ggplot(df_box, aes(clust, ShareOfRice))+ stat_boxplot(geom = "errorbar", width = 0.2, linetype=3) + geom_boxplot(aes(fill = clust),
width =
0.4, alpha = 0.5
) + stat_compare_means(comparisons = my_comparisons, label = "p.signif") + theme_fivethirtyeight( base_family = "Arial") + labs(
x = "",
title = "Share of land allocated to Rice by Farmer Type",
caption = " Pairwise mean comparisons [ns: p > 0.05] [*: p <= 0.05]
[**: p <= 0.01] [***: p <= 0.001][****: p <= 0.0001]"
) + theme(
legend.position =
"none",
plot.title = element_text(hjust = 0.5, size = 12)
)
gg <- ggplot(df_box, aes(clust, ShareOfMaize))+ stat_boxplot(geom = "errorbar", width = 0.2, linetype=3) + geom_boxplot(aes(fill = clust),
width =
0.4, alpha = 0.5
) + stat_compare_means(comparisons = my_comparisons, label = "p.signif") + theme_fivethirtyeight( base_family = "Arial") + labs(
x = "",
title = "Share of land allocated to Maize by Farmer Type",
caption = " Pairwise mean comparisons [ns: p > 0.05] [*: p <= 0.05]
[**: p <= 0.01] [***: p <= 0.001][****: p <= 0.0001]"
) + theme(
legend.position =
"none",
plot.title = element_text(hjust = 0.5, size = 12)
)
gg <- ggplot(df_box, aes(clust, HouseholdCommercializationIndex))+ stat_boxplot(geom = "errorbar", width = 0.2, linetype=3) + geom_boxplot(aes(fill = clust),
width =
0.4, alpha = 0.5
) + stat_compare_means(comparisons = my_comparisons, label = "p.signif") + theme_fivethirtyeight( base_family = "Arial") + labs(
x = "",
title = "Commercialization Index by Farmer Type",
caption = " Pairwise mean comparisons [ns: p > 0.05] [*: p <= 0.05]
[**: p <= 0.01] [***: p <= 0.001][****: p <= 0.0001]"
) + theme(
legend.position =
"none",
plot.title = element_text(hjust = 0.5, size = 12)
)
gg <- ggplot(df_box, aes(clust, HHsize))+ stat_boxplot(geom = "errorbar", width = 0.2, linetype=3) + geom_boxplot(aes(fill = clust),
width =
0.4, alpha = 0.5
)+ stat_compare_means(comparisons = my_comparisons, label = "p.signif") + theme_fivethirtyeight( base_family = "Arial") + labs(
x = "",
title = "Household Size by Farmer Type",
caption = " Pairwise mean comparisons [ns: p > 0.05] [*: p <= 0.05]
[**: p <= 0.01] [***: p <= 0.001][****: p <= 0.0001]"
) + theme(
legend.position =
"none",
plot.title = element_text(hjust = 0.5, size = 12)
)
gg <- ggplot(df_box, aes(clust, PercentHired))+ stat_boxplot(geom = "errorbar", width = 0.2, linetype=3) + geom_boxplot(aes(fill = clust),
width =
0.4, alpha = 0.5
)+ stat_compare_means(comparisons = my_comparisons, label = "p.signif") + theme_fivethirtyeight( base_family = "Arial") + labs(
x = "",
title = "Percent of Hired labor by Farmer Type",
caption = " Pairwise mean comparisons [ns: p > 0.05] [*: p <= 0.05]
[**: p <= 0.01] [***: p <= 0.001][****: p <= 0.0001]"
) + theme(
legend.position =
"none",
plot.title = element_text(hjust = 0.5, size = 12)
)
gg <- ggplot(df_box[which(df_box$TLU < 20), ], aes(clust, TLU))+ stat_boxplot(geom = "errorbar", width = 0.2, linetype=3) + geom_boxplot(aes(fill = clust),
width =
0.4, alpha = 0.5
) + stat_compare_means(comparisons = my_comparisons, label = "p.signif") + theme_fivethirtyeight( base_family = "Arial") + labs(
x = "",
title = "Tropical Livestock Unit by Farmer Type",
caption = " Pairwise mean comparisons [ns: p > 0.05] [*: p <= 0.05]
[**: p <= 0.01] [***: p <= 0.001][****: p <= 0.0001]"
) + theme(
legend.position =
"none",
plot.title = element_text(hjust = 0.5, size = 12)
)
gg <- ggplot(df_box, aes(clust, Age))+ stat_boxplot(geom = "errorbar", width = 0.2, linetype=3) + geom_boxplot(aes(fill = clust),
width =
0.4, alpha = 0.5
) + stat_compare_means(comparisons = my_comparisons, label = "p.signif") + theme_fivethirtyeight( base_family = "Arial") + labs(
x = "",
title = "Age of Household Head by Farmer Type",
caption = " Pairwise mean comparisons [ns: p > 0.05] [*: p <= 0.05]
[**: p <= 0.01] [***: p <= 0.001][****: p <= 0.0001]"
) + theme(
legend.position =
"none",
plot.title = element_text(hjust = 0.5, size = 12)
)
gg <- ggplot(df_box[which(df_box$PerCapitaIncome<4000),], aes(clust, PerCapitaIncome))+ stat_boxplot(geom = "errorbar", width = 0.2, linetype=3) + geom_boxplot(aes(fill = clust),
width =
0.4, alpha = 0.5
) + stat_compare_means(comparisons = my_comparisons, label = "p.signif") + theme_fivethirtyeight( base_family = "Arial") + labs(
x = "",
title = "Percapita Income by Farmer Type",
caption = " Pairwise mean comparisons [ns: p > 0.05] [*: p <= 0.05]
[**: p <= 0.01] [***: p <= 0.001][****: p <= 0.0001]"
) + theme(
legend.position =
"none",
plot.title = element_text(hjust = 0.5, size = 12)
)
g
5 Validation of Typology with the 2007 Agriculture Sample Survey
To check the validity and stability of the clusters identified above, we ran the same clustering algorithm on the 2007 Agriculture Sample Survey (ASS) of Tanzania. The data contain 810 observations across 54 villages in Kilombero and Ulanga districts. The selection of variables and the algorithms are the same as in the analysis above. However, the ASS data lack two important variables: per capita income and the amount of labor used in crop production.
5.1 PCA
res.pca <- PCA(dataTypology2[,-c(1,2,3)], scale.unit = TRUE, graph = FALSE)
# The amount of variation retained by each PC is called its eigenvalue.
# The first PC corresponds to the direction with the maximum amount of variation in the data set.
eigenvalues <- res.pca$eig
DT::datatable(round_df(head(eigenvalues[, 1:2]),3), class = 'table-bordered', options = list(searching = FALSE,pageLength = 10,lengthMenu = c(5, 10, 15, 20), scrollX = T), caption = htmltools::tags$caption(
style = 'caption-side: top; text-align: center;',
'Table 12: ', htmltools::em('The amount of variation retained by each PC [eigenvalues].')
))
#The correlation between a variable and a PC is called loading.
#The variables can be plotted as points in the component space using their loadings as coordinates.
DT::datatable(round_df(res.pca$var$coord,3), class = 'table-bordered', options = list(searching = FALSE,pageLength =10,lengthMenu = c(5, 10, 15, 20), scrollX = T),caption = htmltools::tags$caption(
style = 'caption-side: top; text-align: center;',
'Table 13: ', htmltools::em('The correlation between a variable and a PC [loading].')
))
#The squared loadings for variables are called cos2 ( = cor * cor = coord * coord).
#The cos2 values are used to estimate the quality of the representation
#The closer a variable is to the circle of correlations,
#the better its representation on the factor map
#(and the more important it is to interpret these components)
#Variables that are close to the center of the plot are less important for the first components.
DT::datatable(round_df(res.pca$var$cos2,2), class = 'table-bordered', options = list(searching = FALSE,pageLength = 10,lengthMenu = c(5, 10, 15, 20), scrollX = T),caption = htmltools::tags$caption(
style = 'caption-side: top; text-align: center;',
'Table 14: ', htmltools::em('squared loadings for variables [cos2] ')))
fviz_pca_var(res.pca, col.var="cos2") +
scale_color_gradient2(low="white", mid="blue", high="red", midpoint=0.5) + theme_minimal()
#The contributions of variables in accounting for the variability in a given principal component are
# (in percentage) : (variable.cos2 * 100) / (total cos2 of the component)
DT::datatable(round_df(res.pca$var$contrib,3), class = 'table-bordered', options = list(searching = FALSE,pageLength = 10,lengthMenu = c(5, 10, 15, 20), scrollX = T),caption = htmltools::tags$caption(
style = 'caption-side: top; text-align: center;',
'Table 15: ', htmltools::em('The contributions of variables in accounting for the variability in a given principal component.')
))
# p=fviz_pca_contrib(res.pca, choice = "var", axes = 1)
# ggplotly(p)
# p=fviz_pca_contrib(res.pca, choice = "var", axes = 2)
# ggplotly(p)
# p=fviz_pca_contrib(res.pca, choice = "var", axes = 3)
# ggplotly(p)
# p=fviz_pca_contrib(res.pca, choice = "var", axes = 4)
# ggplotly(p)
# p=fviz_pca_contrib(res.pca, choice = "var", axes = 5)
# ggplotly(p)
# #Total contribution on PC1 to PC5
# p=fviz_pca_contrib(res.pca, choice = "var", axes = 1:5)
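The quantities tabulated in this section (eigenvalues, loadings, cos2 and contributions) follow directly from the definitions given in the code comments. As a minimal sketch, they can be recomputed with base R's `prcomp()` on hypothetical toy data; the variables `a` to `d` are simulated and have nothing to do with the survey. For a standardized PCA the results should agree with the tables above up to the sign of each component, though we have not verified this against the package output.

```r
# Hypothetical toy data; variable names are illustrative only.
set.seed(7)
n <- 100
toy <- data.frame(a = rnorm(n))
toy$b <- toy$a + rnorm(n, sd = 0.5)
toy$c <- rnorm(n)
toy$d <- toy$c - toy$a + rnorm(n, sd = 0.5)

# Standardized PCA (each variable scaled to unit variance).
pca <- prcomp(toy, center = TRUE, scale. = TRUE)

# Eigenvalues: variance retained by each PC, and its share of the total.
eigenvalues <- pca$sdev^2
pct_var     <- 100 * eigenvalues / sum(eigenvalues)

# Loadings in the sense used here: correlation between each variable and each PC.
loadings <- sweep(pca$rotation, 2, pca$sdev, `*`)

# cos2: squared loadings, the quality of representation of a variable on a PC.
cos2 <- loadings^2

# Contribution (in percent): variable cos2 * 100 / total cos2 of the component.
contrib <- sweep(cos2, 2, colSums(cos2), `/`) * 100

print(round(pct_var, 1))
print(round(contrib, 1))
```

Each column of `contrib` sums to 100, and each row of `cos2` sums to 1 when all components are kept, which is a quick consistency check on any such table.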
res.desc <- dimdesc(res.pca, axes = c(1,5))
res.desc2 <- dimdesc(res.pca)
DT::datatable(round_df(res.desc2$Dim.1$quanti,5), class = 'table-bordered', options = list(searching = FALSE,pageLength = 10,lengthMenu = c(5, 10, 15, 20), scrollX = T), caption = htmltools::tags$caption(
style = 'caption-side: top; text-align: center;',
'Table 16: ', htmltools::em('Contribution of Variables to first PC.')
) )
DT::datatable(round_df(res.desc2$Dim.2$quanti,5), class = 'table-bordered', options = list(searching = FALSE,pageLength = 10,lengthMenu = c(5, 10, 15, 20), scrollX = T), caption = htmltools::tags$caption(
style = 'caption-side: top; text-align: center;',
'Table 17: ', htmltools::em('Contribution of Variables to Second PC.')))
DT::datatable(round_df(res.desc2$Dim.3$quanti,5), class = 'table-bordered', options = list(searching = FALSE,pageLength = 10,lengthMenu = c(5, 10, 15, 20), scrollX = T), caption = htmltools::tags$caption(
style = 'caption-side: top; text-align: center;',
'Table 18: ', htmltools::em('Contribution of Variables to Third PC.')))
5.2 HCPC
# Plot the dendrogram only
fviz_dend(res.hcpc2, show_labels = FALSE, k_colors = c("#26547C","#EF476F","#FFD166"), rect = TRUE, rect_fill = TRUE, rect_border = c("#26547C","#EF476F","#FFD166"),lower_rect = -0.1, ggtheme = theme_grey(base_family = "Arial"))
#ggplotly(f)%>%layout(showlegend=F)
# Draw only the factor map
#plot(res.hcpc2, choice ="map", draw.tree = FALSE, ind.names = FALSE, centers.plot = TRUE, title = "Factor Map")
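# Apply the complete theme first: adding theme_grey() after theme() would reset legend.position
p <- fviz_cluster(res.hcpc2, geom = "point", main = "Factor Map", palette = "jco") + theme_grey() + theme(legend.position = "none")
ggplotly(p) %>%
layout(showlegend = FALSE)
# Variable describing clusters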
type <- as.data.frame(res.hcpc2$desc.var$quanti.var)
type1 <- res.hcpc2$desc.var$quanti$`1`
type2 <- res.hcpc2$desc.var$quanti$`2`
type3 <- res.hcpc2$desc.var$quanti$`3`
DT::datatable(round_df(type,5), class = 'table-bordered', options = list(searching = FALSE,pageLength = 10,lengthMenu = c(5, 10, 15, 20), scrollX = T), caption = htmltools::tags$caption(
style = 'caption-side: top; text-align: center;',
'Table 19: ', htmltools::em('Link between the cluster variable and the quantitative variables.')))
DT::datatable(round_df(type1,5), class = 'table-bordered', options = list(searching = FALSE,pageLength = 10,lengthMenu = c(5, 10, 15, 20), scrollX = T), caption = htmltools::tags$caption(
style = 'caption-side: top; text-align: center;',
'Table 20: ', htmltools::em('Description of cluster one by quantitative variables')))
DT::datatable(round_df(type2,5), class = 'table-bordered', options = list(searching = FALSE,pageLength = 10,lengthMenu = c(5, 10, 15, 20), scrollX = T), caption = htmltools::tags$caption(
style = 'caption-side: top; text-align: center;',
'Table 21: ', htmltools::em('Description of cluster two by quantitative variables.')))
DT::datatable(round_df(type3,5), class = 'table-bordered', options = list(searching = FALSE,pageLength = 10,lengthMenu = c(5, 10, 15, 20), scrollX = T), caption = htmltools::tags$caption(
style = 'caption-side: top; text-align: center;',
'Table 22: ', htmltools::em('Description of cluster three by quantitative variables.')))
myData2 <- res.hcpc2$data.clust
DataWithCluster <- dataTypology2
DataWithCluster$FarmType <- myData2$clust
DataWithCluster$FarmType <- factor(DataWithCluster$FarmType, labels = c( "Mono-crop rice producers","Agro-Pastoralist","Diversifier"))
df2 <- as.data.frame(DataWithCluster$FarmType)
names(df2)[1]<- "FarmType"
df <- df2%>%
group_by(FarmType)%>%
dplyr::summarise(counts=n())%>%
mutate(prop = round(counts*100/sum(counts), 1),
lab.ypos = cumsum(prop) - 0.5*prop)
plot_ly(df, labels = ~FarmType, values = ~prop, type = 'pie',
textposition = 'inside',
textinfo = 'label+percent',
insidetextfont = list(color = '#FFFFFF'),
hoverinfo = 'text',
marker=list(colors=mycolor3,
line = list(color = '#FFFFFF', width = 1)),
#The 'pull' attribute can also be used to create space between the sectors
showlegend = FALSE) %>%
layout(title = 'Proportion of farmer Types',
xaxis = list(showgrid = FALSE, zeroline = FALSE, showticklabels = FALSE),
yaxis = list(showgrid = FALSE, zeroline = FALSE, showticklabels = FALSE))
5.2.1 Box plot for the variables and farm type
my_comparisons <-
list(
c("Mono-crop rice producers", "Agro-Pastoralist"),
c("Mono-crop rice producers", "Diversifier"),
c("Agro-Pastoralist", "Diversifier")
)
g <- ggplot(DataWithCluster, aes(FarmType, SizeOfCropLand)) + geom_boxplot(aes(fill = FarmType),
width =
0.4, alpha = 0.5
) + stat_boxplot(geom = "errorbar", width = 0.2) +
stat_compare_means(
comparisons = my_comparisons,
method = "t.test",
label = "p.signif"
) + theme_fivethirtyeight( base_family = "Arial") + labs(
x = "",
title = "Farm Size by Farmer Type",
caption = " Pairwise mean comparisons [ns: p > 0.05] [*: p <= 0.05]
[**: p <= 0.01] [***: p <= 0.001][****: p <= 0.0001]"
) + theme(
legend.position =
"none",
plot.title = element_text(hjust = 0.5, size = 12)
)
gg <- ggplot(DataWithCluster, aes(FarmType, ShareOfRice)) + geom_boxplot(aes(fill = FarmType),
width =
0.4, alpha = 0.5
)+ stat_boxplot(geom = "errorbar", width = 0.2) + stat_compare_means(comparisons = my_comparisons, label = "p.signif") + theme_fivethirtyeight( base_family = "Arial") + labs(
x = "",
title = "Share of land allocated to Rice by Farmer Type",
caption = " Pairwise mean comparisons [ns: p > 0.05] [*: p <= 0.05]
[**: p <= 0.01] [***: p <= 0.001][****: p <= 0.0001]"
) + theme(
legend.position =
"none",
plot.title = element_text(hjust = 0.5, size = 12)
)
gg <- ggplot(DataWithCluster, aes(FarmType, ShareOfMaize)) + geom_boxplot(aes(fill = FarmType),
width =
0.4, alpha = 0.5
)+ stat_boxplot(geom = "errorbar", width = 0.2) + stat_compare_means(comparisons = my_comparisons, label = "p.signif") + theme_fivethirtyeight( base_family = "Arial") + labs(
x = "",
title = "Share of land allocated to Maize by Farmer Type",
caption = " Pairwise mean comparisons [ns: p > 0.05] [*: p <= 0.05]
[**: p <= 0.01] [***: p <= 0.001][****: p <= 0.0001]"
) + theme(
legend.position =
"none",
plot.title = element_text(hjust = 0.5, size = 12)
)
gg <- ggplot(DataWithCluster, aes(FarmType, CommercializationIndex)) + geom_boxplot(aes(fill = FarmType),
width =
0.4, alpha = 0.5
)+ stat_boxplot(geom = "errorbar", width = 0.2) + stat_compare_means(comparisons = my_comparisons, label = "p.signif") + theme_fivethirtyeight( base_family = "Arial") + labs(
x = "",
title = "Commercialization Index by Farmer Type",
caption = " Pairwise mean comparisons [ns: p > 0.05] [*: p <= 0.05]
[**: p <= 0.01] [***: p <= 0.001][****: p <= 0.0001]"
) + theme(
legend.position =
"none",
plot.title = element_text(hjust = 0.5, size = 12)
)
gg <- ggplot(DataWithCluster, aes(FarmType, HHsize)) + geom_boxplot(aes(fill = FarmType),
width =
0.4, alpha = 0.5
)+ stat_boxplot(geom = "errorbar", width = 0.2) + stat_compare_means(comparisons = my_comparisons, label = "p.signif") + theme_fivethirtyeight( base_family = "Arial") + labs(
x = "",
title = "Household Size by Farmer Type",
caption = " Pairwise mean comparisons [ns: p > 0.05] [*: p <= 0.05]
[**: p <= 0.01] [***: p <= 0.001][****: p <= 0.0001]"
) + theme(
legend.position =
"none",
plot.title = element_text(hjust = 0.5, size = 12)
)
gg <- ggplot(DataWithCluster, aes(FarmType, DistanceFromRiverInKm)) + geom_boxplot(aes(fill = FarmType),
width =
0.4, alpha = 0.5
)+ stat_boxplot(geom = "errorbar", width = 0.2) + stat_compare_means(comparisons = my_comparisons, label = "p.signif") + theme_fivethirtyeight( base_family = "Arial") + labs(
x = "",
title = "Distance from the River by Farmer Type",
caption = " Pairwise mean comparisons [ns: p > 0.05] [*: p <= 0.05]
[**: p <= 0.01] [***: p <= 0.001][****: p <= 0.0001]"
) + theme(
legend.position =
"none",
plot.title = element_text(hjust = 0.5, size = 12)
)
gg <- ggplot(DataWithCluster[which(DataWithCluster$TLU < 20), ], aes(FarmType, TLU)) + geom_boxplot(aes(fill = FarmType),
width =
0.4, alpha = 0.5
)+ stat_boxplot(geom = "errorbar", width = 0.2) + stat_compare_means(comparisons = my_comparisons, label = "p.signif") + theme_fivethirtyeight( base_family = "Arial") + labs(
x = "",
title = "Tropical Livestock Unit by Farmer Type",
caption = " Pairwise mean comparisons [ns: p > 0.05] [*: p <= 0.05]
[**: p <= 0.01] [***: p <= 0.001][****: p <= 0.0001]"
) + theme(
legend.position =
"none",
plot.title = element_text(hjust = 0.5, size = 12)
)
gg <- ggplot(DataWithCluster, aes(FarmType, Age)) + geom_boxplot(aes(fill = FarmType),
width =
0.4, alpha = 0.5
)+ stat_boxplot(geom = "errorbar", width = 0.2) + stat_compare_means(comparisons = my_comparisons, label = "p.signif") + theme_fivethirtyeight( base_family = "Arial") + labs(
x = "",
title = "Age of Household Head by Farmer Type",
caption = " Pairwise mean comparisons [ns: p > 0.05] [*: p <= 0.05]
[**: p <= 0.01] [***: p <= 0.001][****: p <= 0.0001]"
) + theme(
legend.position =
"none",
plot.title = element_text(hjust = 0.5, size = 12)
)
gg <- ggplot(DataWithCluster, aes(FarmType, DistanceFromIfakara)) + geom_boxplot(aes(fill = FarmType),
width =
0.3, alpha = 0.8
)+ stat_boxplot(geom = "errorbar", width = 0.1) + stat_compare_means(comparisons = my_comparisons, label = "p.signif") + theme_fivethirtyeight( base_family = "Arial") + labs(
x = "",
title = "Distance from the market by Farmer Type",
caption = " Pairwise mean comparisons [ns: p > 0.05] [*: p <= 0.05]
[**: p <= 0.01] [***: p <= 0.001][****: p <= 0.0001]"
) + scale_fill_viridis(discrete = TRUE, option = "D") + theme(
legend.position =
"none",
plot.title = element_text(hjust = 0.5, size = 12)
)
g
5.3 Session Info
R version 3.6.1 (2019-07-05)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale: LC_COLLATE=English_Germany.1252, LC_CTYPE=English_Germany.1252, LC_MONETARY=English_Germany.1252, LC_NUMERIC=C and LC_TIME=English_Germany.1252
attached base packages:
- stats
- graphics
- grDevices
- utils
- datasets
- methods
- base
other attached packages:
- rgdal(v.1.3-4)
- sp(v.1.2-7)
- here(v.0.1)
- leaflet(v.2.0.2)
- viridis(v.0.5.1)
- viridisLite(v.0.3.0)
- knitr(v.1.20)
- MASS(v.7.3-50)
- ineq(v.0.2-13)
- likert(v.1.3.5)
- xtable(v.1.8-2)
- DT(v.0.4)
- reshape2(v.1.4.3)
- splines2(v.0.2.7)
- ggpubr(v.0.1.6)
- magrittr(v.1.5)
- kableExtra(v.0.9.0)
- summarytools(v.0.8.3)
- corrplot(v.0.84)
- clustertend(v.1.4)
- psych(v.1.8.4)
- stargazer(v.5.2.2)
- ade4(v.1.7-11)
- cluster(v.2.0.7-1)
- FactoMineR(v.1.41)
- factoextra(v.1.0.5)
- rhandsontable(v.0.3.6)
- haven(v.2.2.0)
- Hmisc(v.4.1-1)
- Formula(v.1.2-3)
- survival(v.2.44-1.1)
- lattice(v.0.20-38)
- foreign(v.0.8-71)
- tidyr(v.1.0.0)
- ggthemes(v.3.5.0)
- scales(v.1.0.0)
- RColorBrewer(v.1.1-2)
- plotly(v.4.8.0)
- dplyr(v.0.8.3)
- ggplot2(v.3.2.1)
loaded via a namespace (and not attached):
- colorspace(v.1.4-0)
- ggsignif(v.0.4.0)
- pryr(v.0.1.4)
- ellipsis(v.0.3.0)
- rprojroot(v.1.3-2)
- htmlTable(v.1.12)
- base64enc(v.0.1-3)
- rstudioapi(v.0.10)
- ggrepel(v.0.8.0)
- xml2(v.1.2.2)
- codetools(v.0.2-16)
- splines(v.3.6.1)
- leaps(v.3.0)
- mnormt(v.1.5-5)
- zeallot(v.0.1.0)
- jsonlite(v.1.6)
- Cairo(v.1.5-9)
- shiny(v.1.2.0)
- readr(v.1.3.1)
- compiler(v.3.6.1)
- httr(v.1.4.1)
- backports(v.1.1.2)
- assertthat(v.0.2.0)
- Matrix(v.1.2-17)
- lazyeval(v.0.2.1)
- later(v.0.8.0)
- acepack(v.1.4.1)
- htmltools(v.0.4.0)
- tools(v.3.6.1)
- gtable(v.0.2.0)
- glue(v.1.3.0)
- Rcpp(v.1.0.2)
- vctrs(v.0.2.0)
- nlme(v.3.1-140)
- crosstalk(v.1.0.0)
- xfun(v.0.10)
- stringr(v.1.4.0)
- rvest(v.0.3.5)
- mime(v.0.5)
- miniUI(v.0.1.1.1)
- lifecycle(v.0.1.0)
- hms(v.0.5.2)
- promises(v.1.0.1)
- parallel(v.3.6.1)
- yaml(v.2.2.0)
- gridExtra(v.2.3)
- pander(v.0.6.1)
- rpart(v.4.1-15)
- latticeExtra(v.0.6-28)
- stringi(v.1.1.7)
- highr(v.0.6)
- checkmate(v.1.8.5)
- rlang(v.0.4.2)
- pkgconfig(v.2.0.1)
- matrixStats(v.0.53.1)
- bitops(v.1.0-6)
- evaluate(v.0.10.1)
- purrr(v.0.3.3)
- labeling(v.0.3)
- rapportools(v.1.0)
- htmlwidgets(v.1.2)
- tidyselect(v.0.2.5)
- plyr(v.1.8.4)
- bookdown(v.0.7)
- R6(v.2.2.2)
- pillar(v.1.4.2)
- withr(v.2.1.2)
- scatterplot3d(v.0.3-41)
- RCurl(v.1.95-4.10)
- nnet(v.7.3-12)
- tibble(v.2.1.3)
- crayon(v.1.3.4)
- questionr(v.0.6.3)
- rmarkdown(v.1.10)
- grid(v.3.6.1)
- data.table(v.1.11.4)
- rmdformats(v.0.3.5)
- forcats(v.0.4.0)
- digest(v.0.6.17)
- flashClust(v.1.01-2)
- httpuv(v.1.4.5.1)
- munsell(v.0.5.0)
3.6 Social Network and Institution
To make the most of their farming decisions, farmers need access to a range of information on production, technology, weather, marketing and so on, which saves them time and money. Farmers use different sources to access information on crop production, markets and government policy. For information related to production and extension services, relatives, friends or neighbors were the primary source identified by 42 percent of respondents in the valley and 69.9 percent, and 23 percent and 21 percent identified extension officers and radio as sources of information, respectively. For information related to markets, farmers identified their social ties and radio as the main sources. Radio is the main source of information on new or changed government policy.